PREFACE


THE MATRIX CALCULUS is widely applied nowadays in various branches of 
mathematics, mechanics, theoretical physics, theoretical electrical engineer- 
ing, etc. However, neither in the Soviet nor the foreign literature is there a 
book that gives a sufficiently complete account of the problems of matrix 
theory and of its diverse applications. The present book is an attempt to fill 
this gap in the mathematical literature. 

The book is based on lecture courses on the theory of matrices and its 
applications that the author has given several times in the course of the last 
seventeen years at the Universities of Moscow and Tiflis and at the Moscow 
Institute of Physical Technology.

The book is meant not only for mathematicians (undergraduates and 
research students) but also for specialists in allied fields (physics, engi- 
neering) who are interested in mathematics and its applications. Therefore 
the author has endeavoured to make his account of the material as accessible 
as possible, assuming only that the reader is acquainted with the theory of 
determinants and with the usual course of higher mathematics within the 
programme of higher technical education. Only a few isolated sections in 
the last chapters of the book require additional mathematical knowledge on 
the part of the reader. Moreover, the author has tried to keep the indi- 
vidual chapters as far as possible independent of each other. For example, 
Chapter V, Functions of Matrices, does not depend on the material con- 
tained in Chapters II and III. At those places of Chapter V where funda- 
mental concepts introduced in Chapter IV are being used for the first time,
the corresponding references are given. Thus, a reader who is acquainted
with the rudiments of the theory of matrices can immediately begin with 
reading the chapters that interest him. 

The book consists of two parts, containing fifteen chapters. 

In Chapters I and III, information about matrices and linear operators
is developed ab initio and the connection between operators and matrices 
is introduced. 

Chapter II expounds the theoretical basis of Gauss’s elimination method
and certain associated effective methods of solving a system of n linear
equations, for large n. In this chapter the reader also becomes acquainted
with the technique of operating with matrices that are divided into rectan-
gular ‘blocks.’

In Chapter IV we introduce the extremely important ‘characteristic’
and ‘minimal’ polynomials of a square matrix, and the ‘adjoint’ and ‘reduced
adjoint’ matrices. 

In Chapter V, which is devoted to functions of matrices, we give the
general definition of $f(A)$ as well as concrete methods of computing it,
where $f(\lambda)$ is a function of a scalar argument $\lambda$ and $A$ is a square matrix.
The concept of a function of a matrix is used in §§ 5 and 6 of this chapter
for a complete investigation of the solutions of a system of linear differen-
tial equations of the first order with constant coefficients. Both the concept
of a function of a matrix and this latter investigation of differential equa-
tions are based entirely on the concept of the minimal polynomial of a matrix
and, in contrast to the usual exposition, do not use the so-called theory of
elementary divisors, which is treated in Chapters VI and VII.

These five chapters constitute a first course on matrices and their appli-
cations. Very important problems in the theory of matrices arise in con-
nection with the reduction of matrices to a normal form. This reduction
is carried out on the basis of Weierstrass’ theory of elementary divisors.
In view of the importance of this theory we give two expositions in this 
book: an analytic one in Chapter VI and a geometric one in Chapter VII. 
We draw the reader’s attention to §§ 7 and 8 of Chapter VI, where we study 
effective methods of finding a matrix that transforms a given matrix to 
normal form. In § 8 of Chapter VII we investigate in detail the method
of A. N. Krylov for the practical computation of the coefficients of the 
characteristic polynomial. 

In Chapter VIII certain types of matrix equations are solved. We also
consider here the problem of determining all the matrices that are permutable
with a given matrix and we study in detail the many-valued functions of
matrices $\sqrt[m]{A}$ and $\ln A$.

Chapters IX and X deal with the theory of linear operators in a unitary 
space and the theory of quadratic and hermitian forms. These chapters do 
not depend on Weierstrass’ theory of elementary divisors and use, of the 
preceding material, only the basic information on matrices and linear opera-
tors contained in the first three chapters of the book. In § 9 of Chapter X
we apply the theory of forms to the study of the principal oscillations of a 
system with n degrees of freedom. In § 11 of this chapter we give an account 
of Frobenius’ deep results on the theory of Hankel forms. These results are 
used later, in Chapter XV, to study special cases of the Routh-Hurwitz 
problem. 

The last five chapters form the second part of the book [the second 
volume, in the present English translation]. In Chapter XI we determine 
normal forms for complex symmetric, skew-symmetric, and orthogonal mat-
rices and establish interesting connections of these matrices with real matrices
of the same classes and with unitary matrices. 

In Chapter XII we expound the general theory of pencils of matrices of 
the form $A + \lambda B$, where A and B are arbitrary rectangular matrices of the
same dimensions. Just as the study of regular pencils of matrices $A + \lambda B$
is based on Weierstrass’ theory of elementary divisors, so the study of singu-
lar pencils is built upon Kronecker’s theory of minimal indices, which is, as
it were, a further development of Weierstrass’s theory. By means of Kron-
ecker’s theory (the author believes that he has succeeded in simplifying the
exposition of this theory) we establish in Chapter XII canonical forms of
the pencil of matrices $A + \lambda B$ in the most general case. The results obtained
there are applied to the study of systems of linear differential equations 
with constant coefficients. 

In Chapter XIII we explain the remarkable spectral properties of mat- 
rices with non-negative elements and consider two important applications 
of matrices of this class: 1) homogeneous Markov chains in the theory of 
probability and 2) oscillatory properties of elastic vibrations in mechanics. 
The matrix method of studying homogeneous Markov chains was developed 
in the book [46] by V. I. Romanovskii and is based on the fact that the matrix
of transition probabilities in a homogeneous Markov chain with a finite 
number of states is a matrix with non-negative elements of a special type 
(a ‘stochastic’ matrix). 

The oscillatory properties of elastic vibrations are connected with another 
important class of non-negative matrices—the ‘oscillation matrices.’ These 
matrices and their applications were studied by M. G. Krein jointly with 
the author of this book. In Chapter XIII, only certain basic results in this
domain are presented. The reader can find a detailed account of the whole
material in the monograph [17].

In Chapter XIV we compile the applications of the theory of matrices 
to systems of differential equations with variable coefficients. The central 
place (§§ 5-9) in this chapter belongs to the theory of the multiplicative 
integral (Produktintegral) and its connection with Volterra’s infinitesimal 
calculus. These problems are almost entirely unknown in Soviet mathe-
matical literature. In the first sections and in § 11, we study reducible 
systems (in the sense of Lyapunov) in connection with the problem of stabil- 
ity of motion; we also give certain results of N. P. Erugin. Sections 9-11 
refer to the analytic theory of systems of differential equations. Here we 
clarify an inaccuracy in Birkhoff’s fundamental theorem, which is usually 
applied to the investigation of the solution of a system of differential equa- 
tions in the neighborhood of a singular point, and we establish a canonical 
form of the solution in the case of a regular singular point.

In § 12 of Chapter XIV we give a brief survey of some results of the
fundamental investigations of I. A. Lappo-Danilevskii on analytic functions
of several matrices and their applications to differential systems.

The last chapter, Chapter XV, deals with the applications of the theory 
of quadratic forms (in particular, of Hankel forms) to the Routh-Hurwitz 
problem of determining the number of roots of a polynomial in the right 
half-plane (Re z > 0). The first sections of the chapter contain the classical
treatment of the problem. In § 5 we give the theorem of A. M. Lyapunov in 
which a stability criterion is set up which is equivalent to the Routh-Hurwitz 
criterion. Together with the stability criterion of Routh-Hurwitz we give, 
in § 11 of this chapter, the comparatively little known criterion of Liénard 
and Chipart in which the number of determinant inequalities is only about 
half of that in the Routh-Hurwitz criterion. 

At the end of Chapter XV we exhibit the close connection between stabil- 
ity problems and two remarkable theorems of A. A. Markov and P. L. 
Chebyshev, which were obtained by these celebrated authors on the basis of the 
expansion of certain continued fractions of special types in series of decreas- 
ing powers of the argument. Here we give a matrix proof of these theorems. 

This, then, is a brief summary of the contents of this book. 


F. R. Gantmacher


CHAPTER XI


COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND 
ORTHOGONAL MATRICES 


In Volume I, Chapter IX, in connection with the study of linear operators 
in a euclidean space, we investigated real symmetric, skew-symmetric, and 
orthogonal matrices, i.e., real square matrices characterized by the relations†

$$S^{T} = S, \qquad K^{T} = -K, \qquad \text{and} \qquad Q^{T} = Q^{-1},$$

respectively (here $Q^{T}$ denotes the transpose of the matrix $Q$). We have
shown that in the field of complex numbers all these matrices have linear
elementary divisors and we have set up normal forms for them, i.e., ‘simplest’
real symmetric, skew-symmetric, and orthogonal matrices to which arbitrary 
matrices of the types under consideration are real-similar and orthogonally 
similar. 

The present chapter deals with the investigation of complex symmetric, 
skew-symmetric, and orthogonal matrices. We shall clarify the question 
of what elementary divisors these matrices can have and shall set up normal
forms for them. These forms have a considerably more complicated struc-
ture than the corresponding normal forms in the real case. As a preliminary,
we shall establish in the first section interesting connections between com- 
plex orthogonal and unitary matrices on the one hand, and real symmetric, 
skew-symmetric, and orthogonal matrices on the other hand. 


§ 1. Some Formulas for Complex Orthogonal and Unitary Matrices 


1. We begin with a lemma:

LEMMA 1:¹ 1. If a matrix $G$ is both hermitian and orthogonal ($G^{T} = \bar G = G^{-1}$),
then it can be represented in the form

$$G = I e^{iK}, \tag{1}$$

where $I$ is a real symmetric involutory matrix and $K$ a real skew-symmetric
matrix permutable with it:

$$I = \bar I = I^{T}, \qquad I^{2} = E, \qquad K = \bar K = -K^{T}. \tag{2}$$

¹ See [169], pp. 223-225.
† In this and in the following chapters, a matrix denoted by the letter Q is not neces-
sarily orthogonal.


2. If, in addition, $G$ is a positive-definite hermitian matrix,² then in
(1) $I = E$ and
$$G = e^{iK}. \tag{3}$$

Proof. Let
$$G = S + iT, \tag{4}$$

where S and T are real matrices. Then
$$\bar G = S - iT \qquad \text{and} \qquad G^{T} = S^{T} + iT^{T}. \tag{5}$$

Therefore the equation $\bar G = G^{T}$ implies that $S = S^{T}$ and $T = -T^{T}$, i.e., S is
symmetric and T skew-symmetric.

Moreover, when the expressions for $G$ and $\bar G$ from (4) and (5) are sub-
stituted in the complex equation $G\bar G = E$, it breaks up into two real equations:

$$S^{2} + T^{2} = E \qquad \text{and} \qquad ST = TS. \tag{6}$$


The second of these equations shows that S and T commute. 

By Theorem 12’ of Chapter IX (Vol. I, p. 292), the commuting normal 
matrices S and T can be carried simultaneously into quasi-diagonal form by 
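a real orthogonal transformation. Therefore³

$$S = Q\,\{s_1, s_1, s_2, s_2, \ldots, s_q, s_q, s_{2q+1}, \ldots, s_n\}\,Q^{-1},$$

$$T = Q\left\{\begin{pmatrix} 0 & t_1\\ -t_1 & 0\end{pmatrix}, \begin{pmatrix} 0 & t_2\\ -t_2 & 0\end{pmatrix}, \ldots, \begin{pmatrix} 0 & t_q\\ -t_q & 0\end{pmatrix}, 0, \ldots, 0\right\}Q^{-1} \qquad (Q = \bar Q = (Q^{T})^{-1}) \tag{7}$$

(the numbers $s_i$ and $t_i$ are real). Hence

$$G = S + iT = Q\left\{\begin{pmatrix} s_1 & it_1\\ -it_1 & s_1\end{pmatrix}, \begin{pmatrix} s_2 & it_2\\ -it_2 & s_2\end{pmatrix}, \ldots, \begin{pmatrix} s_q & it_q\\ -it_q & s_q\end{pmatrix}, s_{2q+1}, \ldots, s_n\right\}Q^{-1}. \tag{8}$$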


On the other hand, when we compare the expressions (7) for S and T 
with the first of the equations (6), we find: 


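$$s_1^2 - t_1^2 = 1, \quad s_2^2 - t_2^2 = 1, \quad \ldots, \quad s_q^2 - t_q^2 = 1, \quad s_{2q+1} = \pm 1, \ \ldots, \ s_n = \pm 1. \tag{9}$$

² I.e., $G$ is the coefficient matrix of a positive-definite hermitian form (see Vol. I,
Chapter X, § 9).
³ See also the Note following Theorem 12’ of Vol. I, Chapter IX (p. 293).

Now it is easy to verify that a matrix of the type $\begin{pmatrix} s & it\\ -it & s\end{pmatrix}$ with $s^2 - t^2 = 1$
can always be represented in the form

$$\begin{pmatrix} s & it\\ -it & s\end{pmatrix} = \varepsilon\, e^{\,i\begin{pmatrix} 0 & \varphi\\ -\varphi & 0\end{pmatrix}},$$

where

$$|s| = \cosh\varphi, \qquad \varepsilon t = \sinh\varphi, \qquad \varepsilon = \operatorname{sign} s.$$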


Therefore we have from (8) and (9): 


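$$G = Q\left\{\pm e^{\,i\begin{pmatrix} 0 & \varphi_1\\ -\varphi_1 & 0\end{pmatrix}}, \ \pm e^{\,i\begin{pmatrix} 0 & \varphi_2\\ -\varphi_2 & 0\end{pmatrix}}, \ \ldots, \ \pm e^{\,i\begin{pmatrix} 0 & \varphi_q\\ -\varphi_q & 0\end{pmatrix}}, \ \pm 1, \ldots, \pm 1\right\}Q^{-1}, \tag{10}$$

i.e.,

$$G = I e^{iK},$$

where

$$I = Q\,\{\pm 1, \pm 1, \ldots, \pm 1\}\,Q^{-1}, \qquad K = Q\left\{\begin{pmatrix} 0 & \varphi_1\\ -\varphi_1 & 0\end{pmatrix}, \ldots, \begin{pmatrix} 0 & \varphi_q\\ -\varphi_q & 0\end{pmatrix}, 0, \ldots, 0\right\}Q^{-1} \tag{11}$$

and

$$IK = KI.$$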

From (11) there follows the equation (2). 
2. If, in addition, it is known that G is a positive-definite hermitian 
matrix, then we can state that all the characteristic values of G are positive
(see Volume I, Chapter IX, p. 270). But by (10) these characteristic values
are

$$\pm e^{\varphi_1}, \ \pm e^{-\varphi_1}, \ \pm e^{\varphi_2}, \ \pm e^{-\varphi_2}, \ \ldots, \ \pm e^{\varphi_q}, \ \pm e^{-\varphi_q}, \ \pm 1, \ \ldots, \ \pm 1$$

(here the signs correspond to the signs in (10)).

Therefore in the formula (10) and the first formula of (11), wherever
the sign ± occurs, the + sign must hold. Hence

$$I = Q\,\{1, 1, \ldots, 1\}\,Q^{-1} = E,$$


and this is what we had to prove. 
This completes the proof of the lemma. 
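
As an informal numerical check of the 2 × 2 block identity used in this proof, the following Python sketch (ours, not part of the original text) compares the matrix $\begin{pmatrix} s & it\\ -it & s\end{pmatrix}$ with $\varepsilon\, e^{\,i\begin{pmatrix} 0 & \varphi\\ -\varphi & 0\end{pmatrix}}$ and confirms that the block is hermitian and orthogonal. The values of `phi` and `eps`, the helper names, and the truncated power series used for the exponential are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def conjugate(a):
    return [[complex(x).conjugate() for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def expm(a, terms=40):
    """Matrix exponential by a truncated power series (adequate for small norms)."""
    n = len(a)
    total = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        # term now holds a^k / k!
        term = [[x / k for x in row] for row in matmul(term, a)]
        total = [[u + v for u, v in zip(ru, rv)] for ru, rv in zip(total, term)]
    return total

phi, eps = 0.7, -1                       # illustrative values only
s, t = eps * math.cosh(phi), eps * math.sinh(phi)
block = [[s, 1j * t], [-1j * t, s]]      # the matrix [[s, it], [-it, s]]

# eps * exp(i * [[0, phi], [-phi, 0]])
exponent = [[0, 1j * phi], [-1j * phi, 0]]
from_exponential = [[eps * x for x in row] for row in expm(exponent)]

E = [[1, 0], [0, 1]]
matches_exponential = close(block, from_exponential)
is_hermitian = close(conjugate(transpose(block)), block)
is_orthogonal = close(matmul(transpose(block), block), E)
```

Here `s` and `t` are built from $\varphi$ and $\varepsilon$ so that $|s| = \cosh\varphi$, $\varepsilon t = \sinh\varphi$ and $s^2 - t^2 = 1$ hold by construction.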
By means of the lemma we shall now prove the following theorem:

THEOREM 1: Every complex orthogonal matrix $Q$ can be represented in
the form
$$Q = R e^{iK}, \tag{12}$$
where $R$ is a real orthogonal matrix and $K$ a real skew-symmetric matrix:

$$R = \bar R = (R^{T})^{-1}, \qquad K = \bar K = -K^{T}. \tag{13}$$


Proof. Suppose that (12) holds. Then 


$$Q^{*} = \bar Q^{T} = e^{iK} R^{T}$$

and
$$Q^{*} Q = e^{iK} R^{T} R\, e^{iK} = e^{2iK}.$$


By the preceding lemma the required real skew-symmetric matrix K can 
be determined from the equation 


$$Q^{*} Q = e^{2iK} \tag{14}$$

because the matrix $Q^{*}Q$ is positive-definite hermitian and orthogonal. After
K has been determined from (14) we can find R from (12):

$$R = Q e^{-iK}. \tag{15}$$

Then
$$R^{*} R = e^{-iK} Q^{*} Q\, e^{-iK} = E;$$

i.e., R is unitary. On the other hand, it follows from (15) that R, as the
product of two orthogonal matrices, is itself orthogonal: $R^{T}R = E$. Thus
R is at the same time unitary and orthogonal, and hence real. The formula
(15) can be written in the form (12).

This proves the theorem.⁴

Now we establish the following lemma:

LEMMA 2: If a matrix $D$ is both symmetric and unitary ($D = D^{T} = \bar D^{-1}$),
then it can be represented in the form
$$D = e^{iS}, \tag{16}$$

where $S$ is a real symmetric matrix ($S = \bar S = S^{T}$).

⁴ The formula (12), like the polar decomposition of a complex matrix (in connection
with the formulas (87), (88) on p. 278 of Vol. I) has a close connection with the important
Theorem of Cartan which establishes a certain representation for the automorphisms of
the complex Lie groups; see [169], pp. 232-233.

Proof. We set
$$D = U + iV \qquad (\bar U = U, \ \bar V = V). \tag{17}$$
Then
$$\bar D = U - iV, \qquad D^{T} = U^{T} + iV^{T}.$$

The complex equation $D = D^{T}$ splits into the two real equations
$$U = U^{T}, \qquad V = V^{T}.$$

Thus, U and V are real symmetric matrices.
The equation $D\bar D = E$ implies:

$$U^{2} + V^{2} = E, \qquad UV = VU. \tag{18}$$

By the second of these equations, U and V commute. When we apply
Theorem 12’ (together with the Note) of Chapter IX (Vol. I, pp. 292-3)
to them, we obtain:

$$U = Q\,\{s_1, s_2, \ldots, s_n\}\,Q^{-1}, \qquad V = Q\,\{t_1, t_2, \ldots, t_n\}\,Q^{-1} \qquad (Q = \bar Q = (Q^{T})^{-1}). \tag{19}$$

Here $s_k$ and $t_k$ ($k = 1, 2, \ldots, n$) are real numbers. Now the first of the
equations (18) yields:

$$s_k^2 + t_k^2 = 1 \qquad (k = 1, 2, \ldots, n).$$
Therefore there exist real numbers $\varphi_k$ ($k = 1, 2, \ldots, n$) such that
$$s_k = \cos\varphi_k, \qquad t_k = \sin\varphi_k \qquad (k = 1, 2, \ldots, n).$$
Substituting these expressions for $s_k$ and $t_k$ in (19) and using (17), we find:

$$D = Q\,\{e^{i\varphi_1}, e^{i\varphi_2}, \ldots, e^{i\varphi_n}\}\,Q^{-1} = e^{iS},$$

where
$$S = Q\,\{\varphi_1, \varphi_2, \ldots, \varphi_n\}\,Q^{-1}. \tag{20}$$

From (20) it follows that $S = \bar S = S^{T}$.
This proves the lemma.
Using the lemma we shall now prove the following theorem: 


THEOREM 2: Every unitary matrix $U$ can be represented in the form
$$U = R e^{iS}, \tag{21}$$
where $R$ is a real orthogonal matrix and $S$ a real symmetric matrix:

$$R = \bar R = (R^{T})^{-1}, \qquad S = \bar S = S^{T}. \tag{22}$$

Proof. From (21) it follows that
$$U^{T} = e^{iS} R^{T}. \tag{23}$$
Multiplying (21) and (23), we obtain from (22):
$$U^{T} U = e^{iS} R^{T} R\, e^{iS} = e^{2iS}.$$
By Lemma 2, the real symmetric matrix S can be determined from the
equation
$$U^{T} U = e^{2iS} \tag{24}$$
because $U^{T}U$ is symmetric and unitary. After S has been determined, we
determine R by the equation

$$R = U e^{-iS}. \tag{25}$$
Then
$$R^{T} = e^{-iS} U^{T}, \tag{26}$$

and so from (24), (25), and (26) it follows that
$$R^{T} R = e^{-iS} U^{T} U e^{-iS} = E,$$
i.e., R is orthogonal.

On the other hand, by (25) R is the product of two unitary matrices
and is therefore itself unitary. Since R is both orthogonal and unitary,
it is real. Formula (25) can be written in the form (21).

This proves the theorem. 


§ 2. Polar Decomposition of a Complex Matrix 


We shall prove the following theorem : 
THEOREM 3: If $A = \|a_{ik}\|_1^n$ is a non-singular matrix with complex
elements, then
$$A = SQ \tag{27}$$
and
$$A = Q_1 S_1, \tag{28}$$

where $S$ and $S_1$ are complex symmetric matrices, $Q$ and $Q_1$ complex orthogo-
nal matrices. Moreover,
$$S = \sqrt{AA^{T}} = f(AA^{T}), \qquad S_1 = \sqrt{A^{T}A} = f_1(A^{T}A),$$

where $f(\lambda)$, $f_1(\lambda)$ are polynomials in $\lambda$.
The factors $S$ and $Q$ in (27) ($Q_1$ and $S_1$ in (28)) are permutable if and
only if $A$ and $A^{T}$ are permutable.

Proof. It is sufficient to establish (27), for when we apply this decom-
position to the matrix $A^{T}$ and determine A from the formula thus obtained,
we arrive at (28).

If (27) holds, then

$$A = SQ, \qquad A^{T} = Q^{-1}S$$
and therefore
$$AA^{T} = S^{2}. \tag{29}$$
Conversely, since $AA^{T}$ is non-singular ($|AA^{T}| = |A|^{2} \neq 0$), the function
$\sqrt{\lambda}$ is defined on the spectrum of this matrix⁵ and therefore an interpola-
tion polynomial $f(\lambda)$ exists such that

$$\sqrt{AA^{T}} = f(AA^{T}). \tag{30}$$
We denote the symmetric matrix (30) by
$$S = \sqrt{AA^{T}}.$$
Then (29) holds, and so $|S| \neq 0$. Determining Q from (27),
$$Q = S^{-1}A,$$

we verify easily that it is an orthogonal matrix (indeed,
$Q^{T}Q = A^{T}S^{-2}A = A^{T}(AA^{T})^{-1}A = E$). Thus (27) is established.

If the factors S and Q in (27) are permutable, then the matrices

$$A = SQ \qquad \text{and} \qquad A^{T} = Q^{-1}S$$
are permutable, since
$$AA^{T} = S^{2}, \qquad A^{T}A = Q^{-1}S^{2}Q.$$
Conversely, if $AA^{T} = A^{T}A$, then
$$S^{2} = Q^{-1}S^{2}Q,$$

i.e., Q is permutable with $S^{2} = AA^{T}$. But then Q is also permutable with
the matrix $S = f(AA^{T})$.
Thus the theorem is proved completely. 
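
The decomposition (27) can be tried out on a small example. The sketch below is ours and rests on an assumption that is not in the text: for a 2 × 2 matrix $M$ it uses the closed form $\sqrt{M} = (M + sE)/t$ with $s^2 = |M|$ and $t^2 = \operatorname{tr} M + 2s$, which follows from the Cayley-Hamilton theorem and is a polynomial of the first degree in $M$. The matrix `A` and all helper names are illustrative.

```python
import cmath

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def sqrt_2x2(m):
    """A square root of a 2x2 matrix, written as a polynomial of degree 1 in m.

    With s^2 = det(m) and t^2 = tr(m) + 2s, the Cayley-Hamilton theorem gives
    ((m + s*E) / t)^2 = m, provided t != 0.
    """
    s = cmath.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = cmath.sqrt(m[0][0] + m[1][1] + 2 * s)
    if abs(t) < 1e-12:
        raise ValueError("this branch of the square root is not usable for m")
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

A = [[1 + 2j, 0.5], [-1j, 2 - 1j]]       # arbitrary non-singular example
M = matmul(A, transpose(A))              # A A^T, a complex symmetric matrix
S = sqrt_2x2(M)                          # S = sqrt(A A^T)
Q = matmul(inverse_2x2(S), A)            # Q = S^{-1} A

E = [[1, 0], [0, 1]]
s_is_symmetric = close(S, transpose(S))
s_squared_is_m = close(matmul(S, S), M)
q_is_orthogonal = close(matmul(transpose(Q), Q), E)
reproduces_a = close(matmul(S, Q), A)
```

The factor `Q` obtained here is complex orthogonal, not unitary; the construction is the complex-symmetric analogue of the usual polar decomposition.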


⁵ See Vol. I, Chapter V, § 1. We choose a single-valued branch of the function $\sqrt{\lambda}$
in a simply connected domain containing all the characteristic values of $AA^{T}$, but not the
number 0.

2. Using the polar decomposition we shall now prove the following theorem:

THEOREM 4: If two complex symmetric or skew-symmetric or orthogonal
matrices are similar:
$$B = T^{-1}AT, \tag{31}$$
then they are orthogonally similar; i.e., there exists an orthogonal matrix Q
such that
$$B = Q^{-1}AQ. \tag{32}$$

Proof. From the conditions of the theorem there follows the existence
of a polynomial $q(\lambda)$ such that

$$A^{T} = q(A), \qquad B^{T} = q(B). \tag{33}$$

In the case of symmetric matrices this polynomial $q(\lambda)$ is identically equal
to $\lambda$ and, in the case of skew-symmetric matrices, to $-\lambda$. If A and B are
orthogonal matrices, then $q(\lambda)$ is the interpolation polynomial for $1/\lambda$ on
the common spectrum of A and B.

Using (33), we conduct the proof of our theorem exactly as we did the
proof of the corresponding Theorem 10 of Chapter IX in the real case
(Vol. I, p. 289). From (31) we deduce

$$q(B) = T^{-1} q(A)\, T$$
or by (33)
$$B^{T} = T^{-1} A^{T} T.$$
Hence
$$B = T^{T} A\, (T^{T})^{-1}.$$

Comparing this equation with (31), we easily find:
$$T T^{T} A = A\, T T^{T}. \tag{34}$$
Let us apply the polar decomposition to the non-singular matrix T:
$$T = SQ \qquad (S = S^{T} = f(TT^{T}), \quad Q^{T} = Q^{-1}).$$

Since by (34) the matrix $TT^{T}$ is permutable with A, the matrix
$S = f(TT^{T})$ is also permutable with A. Therefore, when we substitute the
product SQ for T in (31), we have

$$B = Q^{-1}S^{-1}ASQ = Q^{-1}AQ.$$


This completes the proof of the theorem.


§ 3. The Normal Form of a Complex Symmetric Matrix


1. We shall prove the following theorem:

THEOREM 5: There exists a complex symmetric matrix with arbitrary
preassigned elementary divisors.⁶

Proof. We consider the matrix H of order n in which the elements of
the first superdiagonal are 1 and all the remaining elements are zero. We
shall show that there exists a symmetric matrix S similar to H:

$$S = THT^{-1}. \tag{35}$$
We shall look for the transforming matrix T starting from the conditions:
$$S = THT^{-1} = S^{T} = (T^{T})^{-1} H^{T} T^{T}.$$

This equation can be rewritten as

$$VH = H^{T}V, \tag{36}$$
where V is the symmetric matrix connected with T by the equation⁷
$$T^{T}T = -2iV. \tag{37}$$


Recalling properties of the matrices $H$ and $F = H^{T}$ (Vol. I, pp. 13-14)
we find that every solution V of the matrix equation (36) has the following
form:

$$V = \begin{pmatrix} 0 & \cdots & 0 & a_0\\ \vdots & & a_0 & a_1\\ 0 & a_0 & & \vdots\\ a_0 & a_1 & \cdots & a_{n-1}\end{pmatrix}, \tag{38}$$

where $a_0, a_1, \ldots, a_{n-1}$ are arbitrary complex numbers.

Since it is sufficient for us to find a single transforming matrix T, we
set $a_0 = 1$, $a_1 = \ldots = a_{n-1} = 0$ in this formula and define V by the equation⁸

$$V = \begin{pmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & & \vdots\\ 1 & \cdots & 0 & 0\end{pmatrix}. \tag{39}$$

⁶ In connection with the contents of the present section as well as the two sections
that follow, §§ 4 and 5, see [378].
⁷ To simplify the following formulas it is convenient to introduce the factor $-2i$.
⁸ The matrix V is both symmetric and orthogonal.

Furthermore, we shall require the transforming matrix T to be symmetric:
$$T = T^{T}. \tag{40}$$
Then the equation (37) for T can be written as:
$$T^{2} = -2iV. \tag{41}$$

We shall now look for the required matrix T in the form of a polynomial
in V. Since $V^{2} = E$, this can be taken as a polynomial of the first degree:

$$T = \alpha E + \beta V.$$
From (41), taking into account that $V^{2} = E$, we find:
$$\alpha^{2} + \beta^{2} = 0, \qquad 2\alpha\beta = -2i.$$
We can satisfy these relations by setting $\alpha = 1$, $\beta = -i$. Then
$$T = E - iV. \tag{42}$$
T is a non-singular symmetric matrix.⁹ At the same time, from (41):
$$T^{-1} = \tfrac{1}{2}\, iV^{-1}T = \tfrac{1}{2}\, iVT,$$
i.e.,
$$T^{-1} = \tfrac{1}{2}(E + iV). \tag{43}$$


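Thus, a symmetric form S of H is determined by

$$S = THT^{-1} = \tfrac{1}{2}(E - iV)\,H\,(E + iV), \qquad V = \begin{pmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & & \vdots\\ 1 & \cdots & 0 & 0\end{pmatrix}. \tag{44}$$

Since V satisfies the equation (36) and $V^{2} = E$, the equation (44) can
be rewritten as follows:

$$2S = (H + H^{T}) + i(HV - VH) = \begin{pmatrix} 0 & 1 & & 0\\ 1 & 0 & \ddots & \\ & \ddots & \ddots & 1\\ 0 & & 1 & 0\end{pmatrix} + i\begin{pmatrix} 0 & \cdots & 1 & 0\\ \vdots & & 0 & -1\\ 1 & 0 & & \vdots\\ 0 & -1 & \cdots & 0\end{pmatrix}. \tag{45}$$

(In the second matrix the elements equal to 1 lie on the line $i + k = n$ and the
elements equal to $-1$ on the line $i + k = n + 2$, where i and k are the row and
column indices.)

⁹ The fact that T is non-singular follows, in particular, from (41), because V is non-
singular.

The formula (45) determines a symmetric form S of the matrix H.

In what follows, if n is the order of H, $H = H^{(n)}$, then we shall denote the
corresponding matrices T, V, and S by $T^{(n)}$, $V^{(n)}$, and $S^{(n)}$.

Suppose that arbitrary elementary divisors are given:

$$(\lambda - \lambda_1)^{p_1}, \ (\lambda - \lambda_2)^{p_2}, \ \ldots, \ (\lambda - \lambda_u)^{p_u}. \tag{46}$$
We form the corresponding Jordan matrix
$$J = \{\lambda_1 E^{(p_1)} + H^{(p_1)}, \ \lambda_2 E^{(p_2)} + H^{(p_2)}, \ \ldots, \ \lambda_u E^{(p_u)} + H^{(p_u)}\}.$$

For every matrix $H^{(p_j)}$ we introduce the corresponding symmetric form
$S^{(p_j)}$. From

$$S^{(p_j)} = T^{(p_j)} H^{(p_j)} [T^{(p_j)}]^{-1} \qquad (j = 1, 2, \ldots, u)$$
it follows that
$$\lambda_j E^{(p_j)} + S^{(p_j)} = T^{(p_j)} [\lambda_j E^{(p_j)} + H^{(p_j)}] [T^{(p_j)}]^{-1}.$$

Therefore setting

$$\tilde S = \{\lambda_1 E^{(p_1)} + S^{(p_1)}, \ \lambda_2 E^{(p_2)} + S^{(p_2)}, \ \ldots, \ \lambda_u E^{(p_u)} + S^{(p_u)}\}, \tag{47}$$
$$T = \{T^{(p_1)}, T^{(p_2)}, \ldots, T^{(p_u)}\}, \tag{48}$$
we have:
$$\tilde S = TJT^{-1}.$$

$\tilde S$ is a symmetric form of J. $\tilde S$ is similar to J and has the same elementary
divisors (46) as J. This proves the theorem.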
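
The construction of the symmetric form of a single block can be followed on a small case. The Python sketch below is ours; the order `n = 4` and the helper names are illustrative assumptions. It forms $S = \tfrac{1}{2}(E - iV)H(E + iV)$, checks that $S$ is symmetric, that $2S = (H + H^{T}) + i(HV - VH)$, and that $S$, like $H$, satisfies $S^{n} = O$ but $S^{n-1} \neq O$.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def shift(n):
    """H: ones on the first superdiagonal, zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]

def antidiagonal(n):
    """V: ones on the secondary diagonal, zeros elsewhere."""
    return [[1 if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

n = 4                                     # illustrative order
E, V, H = identity(n), antidiagonal(n), shift(n)

T = [[E[i][j] - 1j * V[i][j] for j in range(n)] for i in range(n)]            # E - iV
T_inv = [[(E[i][j] + 1j * V[i][j]) / 2 for j in range(n)] for i in range(n)]  # (E + iV)/2
S = matmul(matmul(T, H), T_inv)

HV, VH = matmul(H, V), matmul(V, H)
two_s_expected = [[(H[i][j] + H[j][i]) + 1j * (HV[i][j] - VH[i][j])
                   for j in range(n)] for i in range(n)]
two_s = [[2 * x for x in row] for row in S]

# powers[k] holds S^(k+1)
powers, current = [], identity(n)
for _ in range(n):
    current = matmul(current, S)
    powers.append(current)

t_inverse_ok = close(matmul(T, T_inv), E)
is_symmetric = close(S, transpose(S))
agrees_with_expansion = close(two_s, two_s_expected)
```

Because `S` is similar to `H`, it has the single elementary divisor $\lambda^{n}$; the list `powers` makes this visible through the index of nilpotency.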


COROLLARY 1. Every square complex matrix $A = \|a_{ik}\|_1^n$ is similar to a
symmetric matrix.
Applying Theorem 4, we obtain:

COROLLARY 2. Every complex symmetric matrix $S = \|a_{ik}\|_1^n$ is orthogo-
nally similar to a symmetric matrix with the normal form $\tilde S$, i.e., there exists
an orthogonal matrix Q such that

$$S = Q\tilde S Q^{-1}. \tag{49}$$
The normal form of a complex symmetric matrix has the quasi-diagonal
form

$$\tilde S = \{\lambda_1 E^{(p_1)} + S^{(p_1)}, \ \lambda_2 E^{(p_2)} + S^{(p_2)}, \ \ldots, \ \lambda_u E^{(p_u)} + S^{(p_u)}\}, \tag{50}$$

where the blocks $S^{(p)}$ are defined as follows (see (44), (45)):

$$S^{(p)} = \tfrac{1}{2}[E^{(p)} - iV^{(p)}]\,H^{(p)}\,[E^{(p)} + iV^{(p)}] = \tfrac{1}{2}\left[H^{(p)} + H^{(p)T} + i\left(H^{(p)}V^{(p)} - V^{(p)}H^{(p)}\right)\right]$$
$$= \frac{1}{2}\left\{\begin{pmatrix} 0 & 1 & & 0\\ 1 & 0 & \ddots & \\ & \ddots & \ddots & 1\\ 0 & & 1 & 0\end{pmatrix} + i\begin{pmatrix} 0 & \cdots & 1 & 0\\ \vdots & & 0 & -1\\ 1 & 0 & & \vdots\\ 0 & -1 & \cdots & 0\end{pmatrix}\right\}. \tag{51}$$

§ 4. The Normal Form of a Complex Skew-symmetric Matrix


1. We shall examine what restrictions the skew symmetry of a matrix
imposes on its elementary divisors. In this task we shall make use of the
following theorem:

THEOREM 6: A skew-symmetric matrix always has even rank.

Proof. Let r be the rank of the skew-symmetric matrix K. Then K has
r linearly independent rows, say those numbered $i_1, i_2, \ldots, i_r$; all the remain-
ing rows are linear combinations of these r rows. Since the columns of K
are obtained from the corresponding rows by multiplying the elements by
$-1$, every column of K is a linear combination of the columns numbered
$i_1, i_2, \ldots, i_r$. Therefore every minor of order r of K can be represented in
the form

$$\alpha\, K\begin{pmatrix} i_1 & i_2 & \cdots & i_r\\ i_1 & i_2 & \cdots & i_r\end{pmatrix},$$

where $\alpha$ is a constant.
Hence it follows that

$$K\begin{pmatrix} i_1 & i_2 & \cdots & i_r\\ i_1 & i_2 & \cdots & i_r\end{pmatrix} \neq 0.$$

But a skew-symmetric determinant of odd order is always zero. There- 
fore r is even, and the theorem is proved. 
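
Theorem 6 is easy to observe on examples. The following Python sketch is ours and is restricted, as an assumption made for exact arithmetic, to skew-symmetric matrices with integer elements; the helper names and the rule generating the elements are illustrative only. It computes the rank by Gaussian elimination over the rationals.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def skew_symmetric(n, entry):
    """K with K[i][j] = entry(i, j) above the diagonal and K^T = -K."""
    k = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            k[i][j] = entry(i, j)
            k[j][i] = -entry(i, j)
    return k

# K3 = [[0, 1, 3], [-1, 0, 4], [-3, -4, 0]]: odd order, so |K3| = 0
K3 = skew_symmetric(3, lambda i, j: i + 2 * j - 1)

# A family of test matrices of orders 1..7 with an arbitrary element rule
ranks = [rank(skew_symmetric(n, lambda i, j: (7 * i + 3 * j) % 5 - 2))
         for n in range(1, 8)]
```

In every case the computed rank is an even number; for `K3` it equals 2.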


THEOREM 7: 1. If $\lambda_0$ is a characteristic value of the skew-symmetric matrix
K with the corresponding elementary divisors

$$(\lambda - \lambda_0)^{f_1}, \ (\lambda - \lambda_0)^{f_2}, \ \ldots, \ (\lambda - \lambda_0)^{f_t},$$

then $-\lambda_0$ is also a characteristic value of K with the same number and the
same powers of the corresponding elementary divisors of K

$$(\lambda + \lambda_0)^{f_1}, \ (\lambda + \lambda_0)^{f_2}, \ \ldots, \ (\lambda + \lambda_0)^{f_t}.$$

2. If zero is a characteristic value of the skew-symmetric matrix K,¹⁰
then in the system of elementary divisors of K all those of even degree cor-
responding to the characteristic value zero are repeated an even number
of times.

Proof. 1. The transposed matrix $K^{T}$ has the same elementary divisors
as K. But $K^{T} = -K$, and the elementary divisors of $-K$ are obtained
from those of K by replacing the characteristic values $\lambda_1, \lambda_2, \ldots$ by $-\lambda_1,
-\lambda_2, \ldots$. Hence the first part of our theorem follows.

2. Suppose that to the characteristic value zero of K there correspond $\delta_1$
elementary divisors of the form $\lambda$, $\delta_2$ of the form $\lambda^2$, etc. In general, we
denote by $\delta_p$ the number of elementary divisors of the form $\lambda^p$ ($p = 1, 2, \ldots$).
We shall show that $\delta_2, \delta_4, \ldots$ are even numbers.

The defect d of K is equal to the number of linearly independent charac-
teristic vectors corresponding to the characteristic value zero or, what is the
same, to the number of elementary divisors of the form $\lambda, \lambda^2, \lambda^3, \ldots$. There-
fore

$$d = \delta_1 + \delta_2 + \delta_3 + \cdots. \tag{52}$$

Since, by Theorem 6, the rank of K is even and $d = n - r$, d has the same
parity as n. The same statement can be made about the defects $d_3, d_5, \ldots$ of
the matrices $K^3, K^5, \ldots$, because odd powers of a skew-symmetric matrix
are themselves skew-symmetric. Therefore all the numbers $d_1 = d, d_3, d_5, \ldots$
have the same parity.

On the other hand, when K is raised to the m-th power, every elementary
divisor $\lambda^p$ for $p < m$ splits into p elementary divisors (of the first degree)
and for $p \geq m$ into m elementary divisors.¹¹ Therefore the number of ele-
mentary divisors of the matrices $K, K^3, \ldots$ that are powers of $\lambda$ are deter-
mined by the formulas¹²

$$\begin{aligned} d_3 &= \delta_1 + 2\delta_2 + 3(\delta_3 + \delta_4 + \cdots),\\ d_5 &= \delta_1 + 2\delta_2 + 3\delta_3 + 4\delta_4 + 5(\delta_5 + \delta_6 + \cdots),\\ &\ \ \vdots \end{aligned} \tag{53}$$

Comparing (52) with (53) and bearing in mind that all the numbers
$d_1 = d, d_3, d_5, \ldots$ are of the same parity, we conclude easily that $\delta_2, \delta_4, \ldots$ are
even numbers.

This completes the proof of the theorem. 
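
The counting formulas (52) and (53) say that the defect of the m-th power is $d_m = \sum_p \delta_p \min(p, m)$. The Python sketch below is ours; it checks this on a nilpotent Jordan matrix, not on a skew-symmetric matrix, since only the elementary divisors $\lambda^p$ matter for the count. The list `sizes` of degrees is an illustrative assumption, chosen so that $\delta_2$ and $\delta_4$ are even as Theorem 7 requires.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotent_jordan(block_sizes):
    """Quasi-diagonal matrix {H^(p1), H^(p2), ...} with elementary divisors lambda^p."""
    n = sum(block_sizes)
    m = [[0] * n for _ in range(n)]
    start = 0
    for p in block_sizes:
        for i in range(start, start + p - 1):
            m[i][i + 1] = 1
        start += p
    return m

def defect_of_power(matrix, m):
    """Defect (order minus rank) of the m-th power of the matrix."""
    power = matrix
    for _ in range(m - 1):
        power = matmul(power, matrix)
    return len(matrix) - rank(power)

def predicted_defect(block_sizes, m):
    """Right-hand side of (52), (53): each lambda^p contributes min(p, m)."""
    return sum(min(p, m) for p in block_sizes)

sizes = [1, 2, 2, 3, 5]        # delta_1 = 1, delta_2 = 2, delta_3 = 1, delta_5 = 1
N = nilpotent_jordan(sizes)
computed = [defect_of_power(N, m) for m in (1, 3, 5)]      # -> [5, 11, 13]
predicted = [predicted_defect(sizes, m) for m in (1, 3, 5)]
```

For this list the defects $d_1 = 5$, $d_3 = 11$, $d_5 = 13$ are all odd, in agreement with the parity argument for the odd order $n = 13$.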


¹⁰ I.e., if $|K| = 0$. For odd n we always have $|K| = 0$.

¹¹ See Vol. I, Chapter VI, Theorem 9, p. 158.

¹² These formulas were introduced (without reference to Theorem 9) in Vol. I, Chapter
VI (see formulas (49) on p. 155).

2. THEOREM 8: There exists a skew-symmetric matrix with arbitrary pre-
assigned elementary divisors subject to the restrictions 1., 2. of the pre-
ceding theorem.

Proof. To begin with, we shall find a skew-symmetric form for the
quasi-diagonal matrix of order 2p:

$$J_{\lambda_0}^{(pp)} = \{\lambda_0 E + H, \ -\lambda_0 E - H\} \tag{54}$$
having two elementary divisors $(\lambda - \lambda_0)^p$ and $(\lambda + \lambda_0)^p$; here $E = E^{(p)}$,
$H = H^{(p)}$.

We shall look for a transforming matrix T such that
$$T J_{\lambda_0}^{(pp)} T^{-1}$$
is skew-symmetric, i.e., such that the following equation holds:
$$T J_{\lambda_0}^{(pp)} T^{-1} + (T^{T})^{-1} [J_{\lambda_0}^{(pp)}]^{T} T^{T} = O$$
or
$$W J_{\lambda_0}^{(pp)} + [J_{\lambda_0}^{(pp)}]^{T} W = O, \tag{55}$$
where W is the symmetric matrix connected with T by the equation¹³
$$T^{T}T = -2iW. \tag{56}$$
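We dissect W into four square blocks each of order p:
$$W = \begin{pmatrix} W_{11} & W_{12}\\ W_{21} & W_{22}\end{pmatrix}.$$
Then (55) can be written as follows:
$$\begin{pmatrix} W_{11} & W_{12}\\ W_{21} & W_{22}\end{pmatrix}\begin{pmatrix} \lambda_0 E + H & O\\ O & -\lambda_0 E - H\end{pmatrix} + \begin{pmatrix} \lambda_0 E + H^{T} & O\\ O & -\lambda_0 E - H^{T}\end{pmatrix}\begin{pmatrix} W_{11} & W_{12}\\ W_{21} & W_{22}\end{pmatrix} = O. \tag{57}$$
When we perform the indicated operations on the partitioned matrices
on the left-hand side of (57), we replace this equation by four matrix
equations:

$$\begin{aligned} 1.&\quad H^{T}W_{11} + W_{11}(2\lambda_0 E + H) = O,\\ 2.&\quad H^{T}W_{12} - W_{12}H = O,\\ 3.&\quad H^{T}W_{21} - W_{21}H = O,\\ 4.&\quad H^{T}W_{22} + W_{22}(2\lambda_0 E + H) = O. \end{aligned} \tag{58}$$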


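¹³ See footnote 7.

The equation $AX - XB = O$, where A and B are square matrices without
common characteristic values, has only the trivial solution $X = O$.¹⁴ There-
fore the first and fourth of the equations (58) yield: $W_{11} = W_{22} = O$.¹⁵
As regards the second of these equations, it can be satisfied, as we have seen
in the proof of Theorem 5, by setting

$$W_{12} = V = \begin{pmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & & \vdots\\ 1 & \cdots & 0 & 0\end{pmatrix}, \tag{59}$$
since (cf. (36))
$$VH - H^{T}V = O.$$
From the symmetry of W and V it follows that
$$W_{21} = W_{12}^{T} = V.$$
The third equation is then automatically satisfied.
Thus,
$$W = \begin{pmatrix} O & V\\ V & O\end{pmatrix} = V^{(2p)}. \tag{60}$$

But then, as became apparent in the proof of Theorem 5 (see (41) and (42)),
the equation (56) will be satisfied if we set

$$T = E^{(2p)} - iV^{(2p)}. \tag{61}$$
Then
$$T^{-1} = \tfrac{1}{2}\left(E^{(2p)} + iV^{(2p)}\right). \tag{62}$$
Therefore, the required skew-symmetric matrix can be found by the formula¹⁶
$$K_{\lambda_0}^{(pp)} = \tfrac{1}{2}\left[E^{(2p)} - iV^{(2p)}\right] J_{\lambda_0}^{(pp)} \left[E^{(2p)} + iV^{(2p)}\right] = \tfrac{1}{2}\left[J_{\lambda_0}^{(pp)} - J_{\lambda_0}^{(pp)T} + i\left(J_{\lambda_0}^{(pp)}V^{(2p)} - V^{(2p)}J_{\lambda_0}^{(pp)}\right)\right]. \tag{63}$$
When we substitute for $J_{\lambda_0}^{(pp)}$ and $V^{(2p)}$ the corresponding partitioned
matrices from (54) and (60), we find:

¹⁴ See Vol. I, Chapter VIII, § 1.

¹⁵ For $\lambda_0 \neq 0$ the equations 1. and 4. have no solutions other than zero. For $\lambda_0 = 0$
there exist other solutions, but we choose the zero solution.

¹⁶ Here we use equations (55) and (60). From these it follows that
$V^{(2p)} J_{\lambda_0}^{(pp)} V^{(2p)} = -J_{\lambda_0}^{(pp)T}$.

$$K_{\lambda_0}^{(pp)} = \frac{1}{2}\left\{\begin{pmatrix} H - H^{T} & O\\ O & H^{T} - H\end{pmatrix} + i\left[\begin{pmatrix} \lambda_0 E + H & O\\ O & -\lambda_0 E - H\end{pmatrix}\begin{pmatrix} O & V\\ V & O\end{pmatrix} - \begin{pmatrix} O & V\\ V & O\end{pmatrix}\begin{pmatrix} \lambda_0 E + H & O\\ O & -\lambda_0 E - H\end{pmatrix}\right]\right\}$$
$$= \frac{1}{2}\begin{pmatrix} H - H^{T} & i(2\lambda_0 V + HV + VH)\\ -i(2\lambda_0 V + HV + VH) & H^{T} - H\end{pmatrix}, \tag{64}$$
i.e.,
$$K_{\lambda_0}^{(pp)} = \frac{1}{2}\left(\begin{array}{cccc|cccc}
0 & 1 & & 0 & 0 & \cdots & i & 2i\lambda_0\\
-1 & 0 & \ddots & & \vdots & & 2i\lambda_0 & i\\
 & \ddots & \ddots & 1 & i & 2i\lambda_0 & & \vdots\\
0 & & -1 & 0 & 2i\lambda_0 & i & \cdots & 0\\ \hline
0 & \cdots & -i & -2i\lambda_0 & 0 & -1 & & 0\\
\vdots & & -2i\lambda_0 & -i & 1 & 0 & \ddots & \\
-i & -2i\lambda_0 & & \vdots & & \ddots & \ddots & -1\\
-2i\lambda_0 & -i & \cdots & 0 & 0 & & 1 & 0
\end{array}\right). \tag{65}$$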


We shall now construct a skew-symmetric matrix 4‘ of order g having 
one elementary divisor A%, where q is odd. Obviously, the required skew- 
symmetric matrix will be similar to the matrix 


©: A Os oe 
001 
Jo = fee 4s (66) 
os, ee | 
-QO -1 
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In this matrix all the elements outside the first superdiagonal are equal to 
zero, and along the first superdiagonal there are at first (¢g—1)/2 ele- 
ments 1 and then (q—1)/2 elements —1. Setting 


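\[ K^{(q)} = T J^{(q)} T^{-1}, \tag{67} \]
we find from the condition of skew-symmetry:
\[ W_2 J^{(q)} + J^{(q)T} W_2 = O, \tag{68} \]
where
\[ T^T T = -2i W_2. \tag{69} \]

By direct verification we can convince ourselves that the matrix
\[
W_2 = V^{(q)} = \begin{pmatrix} 0 & \cdots & 0 & 1 \\ 0 & \cdots & 1 & 0 \\ \vdots & \iddots & & \vdots \\ 1 & \cdots & 0 & 0 \end{pmatrix}
\]
satisfies the condition (68). Taking this value for $W_2$ we find from (69),
as before:
\[ T = E^{(q)} - iV^{(q)}, \qquad T^{-1} = \tfrac{1}{2}\,[E^{(q)} + iV^{(q)}], \tag{70} \]
\[
K^{(q)} = \tfrac{1}{2}\,[E^{(q)} - iV^{(q)}]\, J^{(q)}\, [E^{(q)} + iV^{(q)}]
= \tfrac{1}{2}\,\bigl[J^{(q)} - J^{(q)T} + i\,(J^{(q)} V^{(q)} - V^{(q)} J^{(q)})\bigr]. \tag{71}
\]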


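When we perform the corresponding computation, we find:
\[
2K^{(q)} = \begin{pmatrix}
0 & 1 & & & & & \\
-1 & 0 & \ddots & & & & \\
 & \ddots & \ddots & 1 & & & \\
 & & -1 & 0 & -1 & & \\
 & & & 1 & \ddots & \ddots & \\
 & & & & \ddots & 0 & -1 \\
 & & & & & 1 & 0
\end{pmatrix}
+ i \begin{pmatrix}
 & & & & & 1 & 0 \\
 & & & & \iddots & 0 & 1 \\
 & & & 1 & \iddots & \iddots & \\
 & & -1 & 0 & 1 & & \\
 & \iddots & \iddots & -1 & & & \\
-1 & 0 & \iddots & & & & \\
0 & -1 & & & & &
\end{pmatrix}. \tag{72}
\]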


Suppose that arbitrary elementary divisors are given, subject to the
conditions of Theorem 7:
\[
\begin{gathered}
(\lambda - \lambda_j)^{p_j},\ (\lambda + \lambda_j)^{p_j} \quad (j = 1, 2, \ldots, u), \\
\lambda^{q_k} \quad (k = 1, 2, \ldots, v;\ q_1, q_2, \ldots, q_v \text{ are odd numbers}).^{17}
\end{gathered} \tag{73}
\]

Then the quasi-diagonal skew-symmetric matrix
\[ \tilde K = \{ K_{\lambda_1}^{(p_1 p_1)}, \ldots, K_{\lambda_u}^{(p_u p_u)};\ K^{(q_1)}, \ldots, K^{(q_v)} \} \tag{74} \]
has the elementary divisors (73).


This concludes the proof of the theorem. 


COROLLARY: Every complex skew-symmetric matrix $K$ is orthogonally
similar to a skew-symmetric matrix having the normal form $\tilde K$ determined
by (74), (65), and (72); i.e., there exists a (complex) orthogonal matrix $Q$
such that
\[ K = Q \tilde K Q^{-1}. \tag{75} \]


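Note. If $K$ is a real skew-symmetric matrix, then it has linear elementary
divisors (see Vol. I, Chapter IX, § 13):
\[
\lambda - i\varphi_1,\ \lambda + i\varphi_1,\ \ldots,\ \lambda - i\varphi_u,\ \lambda + i\varphi_u,\ \underbrace{\lambda, \ldots, \lambda}_{v \text{ times}} \qquad (\varphi_j \text{ are real numbers}).
\]
In this case, setting all the $p_j = 1$ and all the $q_k = 1$ in (74), we obtain as
the normal form of a real skew-symmetric matrix
\[
\tilde K = \left\{ \begin{pmatrix} 0 & \varphi_1 \\ -\varphi_1 & 0 \end{pmatrix}, \ldots, \begin{pmatrix} 0 & \varphi_u \\ -\varphi_u & 0 \end{pmatrix},\ 0, \ldots, 0 \right\}.
\]


§ 5. The Normal Form of a Complex Orthogonal Matrix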


1. Let us begin by examining what restrictions the orthogonality of a 
matrix imposes on its elementary divisors. 


THEOREM 9: 1. If $\lambda_0$ ($\lambda_0^2 \neq 1$) is a characteristic value of an orthogonal
matrix $Q$ and if the elementary divisors
\[ (\lambda - \lambda_0)^{f_1},\ (\lambda - \lambda_0)^{f_2},\ \ldots,\ (\lambda - \lambda_0)^{f_t} \]
correspond to this characteristic value, then $1/\lambda_0$ is also a characteristic value
of $Q$ and it has the same corresponding elementary divisors:
\[ (\lambda - \lambda_0^{-1})^{f_1},\ (\lambda - \lambda_0^{-1})^{f_2},\ \ldots,\ (\lambda - \lambda_0^{-1})^{f_t}. \]

2. If $\lambda_0 = \pm 1$ is a characteristic value of the orthogonal matrix $Q$, then
the elementary divisors of even degree corresponding to $\lambda_0$ are repeated an
even number of times.


17 Some of the numbers $\lambda_1, \lambda_2, \ldots, \lambda_u$ may be zero. Moreover, one of the numbers $u$
and $v$ may be zero; i.e., in some cases there may be elementary divisors of only one type.

Proof. 1. For every non-singular matrix $Q$, on passing from $Q$ to $Q^{-1}$
each elementary divisor $(\lambda - \lambda_0)^f$ is replaced by the elementary divisor
$(\lambda - \lambda_0^{-1})^f$.¹⁸ On the other hand, the matrices $Q$ and $Q^T$ always have the
same elementary divisors. Therefore the first part of our theorem follows
at once from the orthogonality condition $Q^T = Q^{-1}$.

2. Let us assume that the number $1$ is a characteristic value of $Q$, while
$-1$ is not ($|E - Q| = 0$, $|E + Q| \neq 0$). Then we apply Cayley's formulas
(see Vol. I, Chapter IX, § 14), which remain valid for complex matrices.
We define a matrix $K$ by the equation
\[ K = (E - Q)(E + Q)^{-1}. \tag{76} \]

Direct verification shows that $K^T = -K$, so that $K$ is skew-symmetric.
When we solve the equation (76) for $Q$, we find:¹⁹
\[ Q = (E - K)(E + K)^{-1}. \]

Setting $f(\lambda) = \dfrac{1 - \lambda}{1 + \lambda}$, we have $f'(\lambda) = -\dfrac{2}{(1 + \lambda)^2} \neq 0$. Therefore in the
transition from $K$ to $Q = f(K)$ the elementary divisors do not split.²⁰ Hence in
the system of elementary divisors of $Q$ those of the form $(\lambda - 1)^{2p}$ are
repeated an even number of times, because this holds for the elementary
divisors of the form $\lambda^{2p}$ of $K$ (see Theorem 7).
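The Cayley formulas used above are easy to check on a small example. The following sketch is an illustration only and not part of the original text: the helper names (`matmul`, `inverse`, `cayley`) and the sample matrix `Q` are assumptions chosen for the demonstration. It uses exact rational arithmetic, takes an orthogonal $Q$ that has the characteristic value $1$ but not $-1$, forms $K$ by (76), and recovers $Q$ from $K$.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]


def add(X, Y, sign=1):
    """Return X + sign * Y."""
    return [[a + sign * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def inverse(X):
    """Gauss-Jordan inverse over the rationals (X must be non-singular)."""
    n = len(X)
    aug = [list(map(F, row)) + identity(n)[i] for i, row in enumerate(X)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def cayley(X):
    """Return (E - X)(E + X)^(-1); requires |E + X| != 0."""
    E = identity(len(X))
    return matmul(add(E, X, -1), inverse(add(E, X)))


# Sample orthogonal matrix (an assumption): a rational rotation plus a fixed axis.
Q = [[F(3, 5), F(-4, 5), F(0)],
     [F(4, 5), F(3, 5), F(0)],
     [F(0), F(0), F(1)]]

K = cayley(Q)        # formula (76)
Q_back = cayley(K)   # the inverse formula Q = (E - K)(E + K)^(-1)
```

The same function serves in both directions because the map $X \mapsto (E - X)(E + X)^{-1}$ is its own inverse wherever it is defined.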

The case where $Q$ has the characteristic value $-1$, but not $+1$, is reduced
to the preceding case by considering the orthogonal matrix $-Q$.


18 See Vol. I, Chapter VI, § 7. Setting $f(\lambda) = 1/\lambda$, we have $f'(\lambda) = -1/\lambda^2 \neq 0$.
Hence it follows that in the transition from $Q$ to $Q^{-1}$ the elementary divisors do not split
(see Vol. I, p. 158).

19 Note that (76) implies that $E + K = 2(E + Q)^{-1}$ and therefore
$|E + K| = 2^n\,|E + Q|^{-1} \neq 0$.

20 See Vol. I, p. 158.


We now proceed to the most complicated case, where $Q$ has both the
characteristic values $+1$ and $-1$. We denote by $\psi(\lambda)$ the minimal polynomial
of $Q$. Using the first part of the theorem, which has already been
proved, we can write $\psi(\lambda)$ in the form
\[
\psi(\lambda) = (\lambda - 1)^{m_1} (\lambda + 1)^{m_2} \prod_{j=1}^{u} (\lambda - \lambda_j)^{f_j} (\lambda - \lambda_j^{-1})^{f_j} \qquad (\lambda_j^2 \neq 1;\ j = 1, 2, \ldots, u).
\]

We consider the polynomial $g(\lambda)$ of degree less than $m$ ($m$ is the degree
of $\psi(\lambda)$) for which $g(1) = 1$ and all the remaining $m - 1$ values on the
spectrum of $Q$ are zero; and we set:²¹
\[ P = g(Q). \tag{77} \]

Note that the functions $(g(\lambda))^2$ and $g(1/\lambda)$ assume on the spectrum of $Q$
the same values as $g(\lambda)$. Therefore
\[ P^2 = P, \qquad P^T = g(Q^T) = g(Q^{-1}) = P, \tag{78} \]
i.e., $P$ is a symmetric projective matrix.²²

We define a polynomial $h(\lambda)$ and a matrix $N$ by the equations
\[ h(\lambda) = (\lambda - 1)\, g(\lambda), \tag{79} \]
\[ N = h(Q) = (Q - E) P. \tag{80} \]

Since $(h(\lambda))^{m_1}$ vanishes on the spectrum of $Q$, it is divisible by $\psi(\lambda)$
without remainder. Hence:
\[ N^{m_1} = O, \]
i.e., $N$ is a nilpotent matrix with $m_1$ as index of nilpotency.

From (80) we find:²³
\[ N^T = (Q^T - E) P. \tag{81} \]
21 From the fundamental formula (see Vol. I, p. 104)
\[ g(A) = \sum_{k=1}^{s} \bigl[ g(\lambda_k) Z_{k1} + g'(\lambda_k) Z_{k2} + \cdots \bigr] \]
it follows that $P = Z_{11}$.

22 A hermitian operator $P$ is called projective if $P^2 = P$. In accordance with this,
a hermitian matrix $P$ for which $P^2 = P$ is called projective. An example of a projective
operator $P$ in a unitary space $R$ is the operator of the orthogonal projection of a vector
$x \in R$ into a subspace $S = PR$, i.e., $Px = x_S$, where $x_S \in S$ and $(x - x_S) \perp S$ (see Vol. I,
p. 248).

23 All the matrices that occur here, $P$, $N$, $N^T$, $Q^T = Q^{-1}$, are permutable among each
other and with $Q$, since they are all functions of $Q$.
Let us consider the matrix
\[ R = N (N^T + 2E). \tag{82} \]
From (78), (80), and (81) it follows that
\[ R = N N^T + 2N = (Q - Q^T) P. \]

From this representation of $R$ it is clear that $R$ is skew-symmetric.
On the other hand, from (82)
\[ R^k = N^k (N^T + 2E)^k \qquad (k = 1, 2, \ldots). \tag{83} \]
But $N^T$, like $N$, is nilpotent, and therefore
\[ |N^T + 2E| \neq 0. \]

Hence it follows from (83) that the matrices $R^k$ and $N^k$ have the same rank
for every $k$.

Now for odd $k$ the matrix $R^k$ is skew-symmetric and therefore (see p. 12)
has even rank. Therefore each of the matrices
\[ N,\ N^3,\ N^5,\ \ldots \]
has even rank.

By repeating verbatim for $N$ the arguments that were used on p. 13
for $K$ we may therefore state that among the elementary divisors of $N$ those
of the form $\lambda^{2p}$ are repeated an even number of times. But to each
elementary divisor $\lambda^{2p}$ of $N$ there corresponds an elementary divisor $(\lambda - 1)^{2p}$
of $Q$, and vice versa.²⁴ Hence it follows that among the elementary divisors
of $Q$ those of the form $(\lambda - 1)^{2p}$ are repeated an even number of times.

We obtain a similar statement for the elementary divisors of the form
$(\lambda + 1)^{2p}$ by applying what has just been proved to the matrix $-Q$.

Thus, the proof of the theorem is complete.


2. We shall now prove the converse theorem. 


24 Since $h(1) = 0$, $h'(1) \neq 0$, in passing from $Q$ to $N = h(Q)$ the elementary divisors
of the form $(\lambda - 1)^{2p}$ of $Q$ do not split and are therefore replaced by elementary divisors
$\lambda^{2p}$ (see Vol. I, Chapter VI, § 7).


THEOREM 10: Every system of powers of the form
\[
\begin{gathered}
(\lambda - \lambda_j)^{p_j},\ (\lambda - \lambda_j^{-1})^{p_j} \quad (\lambda_j \neq 0;\ j = 1, 2, \ldots, u), \\
(\lambda - 1)^{q_1},\ (\lambda - 1)^{q_2},\ \ldots,\ (\lambda - 1)^{q_v}, \\
(\lambda + 1)^{t_1},\ (\lambda + 1)^{t_2},\ \ldots,\ (\lambda + 1)^{t_w} \\
(q_1, \ldots, q_v,\ t_1, \ldots, t_w \text{ are odd numbers})
\end{gathered} \tag{84}
\]
is the system of elementary divisors of some complex orthogonal matrix $Q$.²⁵
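Proof. We denote by $\mu_j$ the numbers connected with the numbers $\lambda_j$
($j = 1, 2, \ldots, u$) by the equations
\[ \lambda_j = e^{\mu_j} \qquad (j = 1, 2, \ldots, u). \]

We now introduce the 'canonical' skew-symmetric matrices (see the
preceding section)
\[ K_{\mu_j}^{(p_j p_j)} \ (j = 1, 2, \ldots, u); \quad K^{(q_1)}, \ldots, K^{(q_v)}; \quad K^{(t_1)}, \ldots, K^{(t_w)}, \]
with the elementary divisors
\[ (\lambda - \mu_j)^{p_j},\ (\lambda + \mu_j)^{p_j} \ (j = 1, 2, \ldots, u); \quad \lambda^{q_1}, \ldots, \lambda^{q_v}; \quad \lambda^{t_1}, \ldots, \lambda^{t_w}. \]
If $K$ is a skew-symmetric matrix, then
\[ Q = e^{K} \]
is orthogonal ($Q^T = e^{K^T} = e^{-K} = Q^{-1}$). Moreover, to each elementary
divisor $(\lambda - \mu)^p$ of $K$ there corresponds an elementary divisor $(\lambda - e^{\mu})^p$ of $Q$.²⁶
Therefore the quasi-diagonal matrix
\[
\tilde Q = \Bigl\{ e^{K_{\mu_1}^{(p_1 p_1)}}, \ldots, e^{K_{\mu_u}^{(p_u p_u)}};\ e^{K^{(q_1)}}, \ldots, e^{K^{(q_v)}};\ -e^{K^{(t_1)}}, \ldots, -e^{K^{(t_w)}} \Bigr\} \tag{85}
\]
is orthogonal and has the elementary divisors (84).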

This proves the theorem. 
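The key step of the proof, that $e^K$ is orthogonal whenever $K$ is skew-symmetric (even with complex entries), can be checked numerically. The sketch below is an illustration only: the sample matrix `K`, the helper names, and the use of a truncated power series for the exponential are assumptions made for the demonstration. Note that orthogonality is tested with the plain transpose, not the conjugate transpose.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm_series(K, terms=40):
    """Approximate e^K by a partial sum of its power series (adequate for small K)."""
    n = len(K)
    total = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # After this step, term equals K^k / k!
        term = [[v / k for v in row] for row in matmul(term, K)]
        total = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total


# A complex skew-symmetric sample matrix (K^T = -K); the entries are arbitrary.
a, b, c = 0.3 + 0.4j, 0.1j, -0.2
K = [[0, a, b],
     [-a, 0, c],
     [-b, -c, 0]]

Q = expm_series(K)
QtQ = matmul(transpose(Q), Q)
E = identity(3)
# Largest entry of Q^T Q - E in absolute value; it should be at rounding level.
deviation = max(abs(QtQ[i][j] - E[i][j]) for i in range(3) for j in range(3))
```

A truncated series is used only to keep the example free of external libraries; it is not a recommended general-purpose method for computing matrix exponentials.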

From Theorems 4, 9, and 10 we obtain: 


COROLLARY: Every (complex) orthogonal matrix $Q$ is orthogonally
similar to an orthogonal matrix having the normal form $\tilde Q$; i.e., there exists
an orthogonal matrix $Q_1$ such that
\[ Q = Q_1 \tilde Q\, Q_1^{-1}. \tag{86} \]

Note. Just as we have given a concrete form to the diagonal blocks in the
skew-symmetric matrix $\tilde K$, so we could for the normal form $\tilde Q$.²⁷


25 Some (or even all) of the numbers $\lambda_j$ may be $\pm 1$. One or two of the numbers
$u$, $v$, $w$ may be zero. Then the elementary divisors of the corresponding type are absent
in $Q$.

26 This follows from the fact that for $f(\lambda) = e^{\lambda}$ we have $f'(\lambda) = e^{\lambda} \neq 0$ for every $\lambda$.


27 See [378]. 


CHAPTER XII 


SINGULAR PENCILS OF MATRICES 


§ 1. Introduction 


1. The present chapter deals with the following problem: 


Given four matrices $A$, $B$, $A_1$, $B_1$, all of dimension $m \times n$ with elements
from a number field $F$, it is required to find under what conditions there
exist two square non-singular matrices $P$ and $Q$ of orders $m$ and $n$,
respectively, such that¹
\[ PAQ = A_1, \qquad PBQ = B_1. \tag{1} \]

By introduction of the pencils of matrices $A + \lambda B$ and $A_1 + \lambda B_1$, the
two matrix equations (1) can be replaced by the single equation
\[ P(A + \lambda B)Q = A_1 + \lambda B_1. \tag{2} \]

DEFINITION 1: Two pencils of rectangular matrices $A + \lambda B$ and $A_1 + \lambda B_1$
of the same dimensions $m \times n$ connected by the equation (2), in which $P$ and
$Q$ are constant square non-singular matrices (i.e., matrices independent of
$\lambda$) of orders $m$ and $n$, respectively, will be called strictly equivalent.²

According to the general definition of equivalence of $\lambda$-matrices (see
Vol. I, Chapter VI, p. 132), the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ are equivalent
if an equation of the form (2) holds in which $P$ and $Q$ are two square
$\lambda$-matrices with constant non-vanishing determinants. For strict equivalence
it is required in addition that $P$ and $Q$ do not depend on $\lambda$.³

A criterion for equivalence of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ follows
from the general criterion for equivalence of $\lambda$-matrices and consists in the
equality of the invariant polynomials or, what is the same, of the elementary
divisors of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ (see Vol. I, Chapter VI, p. 141).


In this chapter, we shall establish a criterion for strict equivalence of
two pencils of matrices and we shall determine for each pencil a strictly
equivalent canonical form.


1 If such matrices $P$ and $Q$ exist, then their elements can be taken from the field $F$.
This follows from the fact that the equations (1) can be written in the form $PA = A_1 Q^{-1}$,
$PB = B_1 Q^{-1}$ and are therefore equivalent to a certain system of linear homogeneous
equations for the elements of $P$ and $Q^{-1}$ with coefficients in $F$.

2 See Vol. I, Chapter VI, p. 145.

3 We have replaced the term 'equivalent pencils' that occurs in the literature by
'strictly equivalent pencils,' in order to draw a sharp distinction between Definition 1 and
the definition of equivalence in Vol. I, Chapter VI.


2. The task we have set ourselves has a natural geometrical interpretation.
We consider a pencil of linear operators $A + \lambda B$ mapping $R_n$ into $R_m$. For
a definite choice of bases in these spaces the pencil of operators $A + \lambda B$
corresponds to a pencil of rectangular matrices $A + \lambda B$ (of dimension $m \times n$);
under a change of bases in $R_n$ and $R_m$ the pencil $A + \lambda B$ is replaced by a
strictly equivalent pencil $P(A + \lambda B)Q$, where $P$ and $Q$ are square non-singular
matrices of order $m$ and $n$ (see Vol. I, Chapter III, §§ 2 and 4).
Thus, a criterion for strict equivalence gives a characterization of that class
of matrix pencils $A + \lambda B$ (of dimension $m \times n$) which describe one and the
same pencil of operators $A + \lambda B$ mapping $R_n$ into $R_m$ for various choices of
bases in these spaces.

In order to obtain a canonical form for a pencil it is necessary to find
bases for $R_n$ and $R_m$ in which the pencil of operators $A + \lambda B$ is described by
matrices of the simplest possible form.

Since a pencil of operators is given by two operators $A$ and $B$, we can
also say that: The present chapter deals with the simultaneous investigation
of two operators $A$ and $B$ mapping $R_n$ into $R_m$.

3. All the pencils of matrices $A + \lambda B$ of dimension $m \times n$ fall into two
basic types: regular and singular pencils.

DEFINITION 2: A pencil of matrices $A + \lambda B$ is called regular if
1) $A$ and $B$ are square matrices of the same order $n$; and
2) the determinant $|A + \lambda B|$ does not vanish identically.
In all other cases ($m \neq n$, or $m = n$ but $|A + \lambda B| \equiv 0$), the pencil is called
singular.

A criterion for strict equivalence of regular pencils of matrices and also
a canonical form for such pencils were established by Weierstrass in 1867
[377] on the basis of his theory of elementary divisors, which we have
expounded in Chapters VI and VII. The analogous problems for singular
pencils were solved later, in 1890, by the investigations of Kronecker [249].⁴
Kronecker's results form the primary content of this chapter.


§ 2. Regular Pencils of Matrices 


1. We consider the special case where the pencils $A + \lambda B$ and $A_1 + \lambda B_1$
consist of square matrices ($m = n$) with $|B| \neq 0$, $|B_1| \neq 0$. In this case, as we
have shown in Chapter VI (Vol. I, pp. 145-146), the two concepts of
'equivalence' and 'strict equivalence' of pencils coincide. Therefore, by applying
to the pencils the general criterion for equivalence of $\lambda$-matrices (Vol. I,
p. 141) we are led to the following theorem:

4 Of more recent papers dealing with singular pencils of matrices we mention [234],
[369], and [255].

THEOREM 1: Two pencils of square matrices of the same order $A + \lambda B$
and $A_1 + \lambda B_1$ for which $|B| \neq 0$ and $|B_1| \neq 0$ are strictly equivalent if and
only if the pencils have the same elementary divisors in $F$.

A pencil of square matrices $A + \lambda B$ with $|B| \neq 0$ was called regular in
Chapter VI, because it represents a special case of a regular matrix
polynomial in $\lambda$ (see Vol. I, Chapter IV, p. 76). In the preceding section of this
chapter we have given a wider definition of regularity. According to this
definition it is quite possible in a regular pencil to have $|B| = 0$ (and even
$|A| = |B| = 0$).

In order to find out whether Theorem 1 remains valid for regular pencils
(with the extended Definition 2), we consider the following example:
\[
A + \lambda B = \begin{pmatrix} 2 & 1 & 3 \\ 3 & 2 & 5 \\ 3 & 2 & 6 \end{pmatrix} + \lambda \begin{pmatrix} 1 & 1 & 2 \\ 1 & 1 & 2 \\ 1 & 1 & 3 \end{pmatrix}, \qquad
A_1 + \lambda B_1 = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 1 \end{pmatrix} + \lambda \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}. \tag{3}
\]

It is easy to see that here each of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ has
only one elementary divisor, $\lambda + 1$. However, the pencils are not strictly
equivalent, since the matrices $B$ and $B_1$ are of ranks 2 and 1, respectively;
whereas if an equation (2) were to hold, it would follow from it that the
ranks of $B$ and $B_1$ are equal. Nevertheless, the pencils (3) are regular
according to Definition 2, since
\[ |A + \lambda B| = |A_1 + \lambda B_1| = \lambda + 1. \]


This example shows that Theorem 1 is not true with the extended defini- 
tion of regularity of a pencil. 
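The claims made about example (3) can be verified directly. The sketch below is an illustration only and not part of the original text; the function names are assumptions, and the matrix entries are those of (3) as read from the text. It checks that both determinants equal $\lambda + 1$ and that $B$ and $B_1$ have ranks 2 and 1.

```python
from fractions import Fraction

# Matrices of example (3), as read from the text.
A = [[2, 1, 3], [3, 2, 5], [3, 2, 6]]
B = [[1, 1, 2], [1, 1, 2], [1, 1, 3]]
A1 = [[2, 1, 1], [1, 2, 1], [1, 1, 1]]
B1 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]


def pencil_at(A, B, lam):
    """Evaluate the pencil A + lam*B at a number lam."""
    return [[a + lam * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def rank(M):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][c] != 0:
                factor = rows[r][c] / rows[rk][c]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


# A polynomial of degree at most 3 is fixed by its values at 4 points, so
# agreement with lam + 1 at 7 points shows that both determinants are lam + 1.
dets_agree = all(
    det3(pencil_at(A, B, t)) == t + 1 and det3(pencil_at(A1, B1, t)) == t + 1
    for t in range(-3, 4)
)
ranks = (rank(B), rank(B1))
```

Since strict equivalence multiplies $B$ by non-singular constant matrices on both sides, the differing ranks alone rule it out, exactly as argued above.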


2. In order to preserve Theorem 1, we have to introduce the concept of
'infinite' elementary divisors of a pencil. We shall give the pencil $A + \lambda B$ in
terms of 'homogeneous' parameters $\lambda$, $\mu$: $\mu A + \lambda B$. Then the determinant
$\Delta(\lambda, \mu) = |\mu A + \lambda B|$ is a homogeneous function of $\lambda$, $\mu$. By determining
the greatest common divisor $D_k(\lambda, \mu)$ of all the minors of order $k$ of the
matrix $\mu A + \lambda B$ ($k = 1, 2, \ldots, n$), we obtain the invariant polynomials by
the well known formulas
\[
i_1(\lambda, \mu) = \frac{D_n(\lambda, \mu)}{D_{n-1}(\lambda, \mu)}, \qquad i_2(\lambda, \mu) = \frac{D_{n-1}(\lambda, \mu)}{D_{n-2}(\lambda, \mu)}, \qquad \ldots;
\]
here all the $D_k(\lambda, \mu)$ and $i_j(\lambda, \mu)$ are homogeneous polynomials in $\lambda$ and $\mu$.
Splitting the invariant polynomials into powers of homogeneous polynomials
irreducible over $F$, we obtain the elementary divisors $e_\alpha(\lambda, \mu)$ ($\alpha = 1, 2, \ldots$)
of the pencil $\mu A + \lambda B$ in $F$.

It is quite obvious that if we set $\mu = 1$ in $e_\alpha(\lambda, \mu)$ we are back to the
elementary divisors $e_\alpha(\lambda)$ of the pencil $A + \lambda B$. Conversely, from each
elementary divisor $e_\alpha(\lambda)$ of degree $q$ we obtain the corresponding elementary
divisor $e_\alpha(\lambda, \mu)$ by the formula $e_\alpha(\lambda, \mu) = \mu^q e_\alpha(\lambda/\mu)$. We can obtain in this way
all the elementary divisors of the pencil $\mu A + \lambda B$ apart from those of the
form $\mu^q$.

Elementary divisors of the form $\mu^q$ exist if and only if $|B| = 0$ and are
called 'infinite' elementary divisors of the pencil $A + \lambda B$.

Since strict equivalence of the pencils $A + \lambda B$ and $A_1 + \lambda B_1$ implies
strict equivalence of the pencils $\mu A + \lambda B$ and $\mu A_1 + \lambda B_1$, we see that for
strictly equivalent pencils $A + \lambda B$ and $A_1 + \lambda B_1$ not only their 'finite,' but
also their 'infinite' elementary divisors must coincide.

Suppose now that $A + \lambda B$ and $A_1 + \lambda B_1$ are two regular pencils for
which all the elementary divisors coincide (including the infinite ones).
We introduce homogeneous parameters: $\mu A + \lambda B$, $\mu A_1 + \lambda B_1$. Let us now
transform the parameters
\[ \lambda = \alpha_1 \tilde\lambda + \alpha_2 \tilde\mu, \qquad \mu = \beta_1 \tilde\lambda + \beta_2 \tilde\mu \qquad (\alpha_1 \beta_2 - \alpha_2 \beta_1 \neq 0). \]

In the new parameters the pencils are written as follows:
\[ \tilde\mu \tilde A + \tilde\lambda \tilde B, \quad \tilde\mu \tilde A_1 + \tilde\lambda \tilde B_1, \qquad \text{where } \tilde B = \beta_1 A + \alpha_1 B,\ \tilde B_1 = \beta_1 A_1 + \alpha_1 B_1. \]

From the regularity of the pencils $\mu A + \lambda B$ and $\mu A_1 + \lambda B_1$ it follows that we
can choose the numbers $\alpha_1$ and $\beta_1$ such that $|\tilde B| \neq 0$ and $|\tilde B_1| \neq 0$.

Therefore by Theorem 1 the pencils $\tilde\mu \tilde A + \tilde\lambda \tilde B$ and $\tilde\mu \tilde A_1 + \tilde\lambda \tilde B_1$ and
consequently the original pencils $\mu A + \lambda B$ and $\mu A_1 + \lambda B_1$ (or, what is the same,
$A + \lambda B$ and $A_1 + \lambda B_1$) are strictly equivalent. Thus, we have arrived at
the following generalization of Theorem 1:

THEOREM 2: Two regular pencils $A + \lambda B$ and $A_1 + \lambda B_1$ are strictly
equivalent if and only if they have the same ('finite' and 'infinite')
elementary divisors.

In our example above the pencils (3) had the same 'finite' elementary
divisor $\lambda + 1$, but different 'infinite' elementary divisors (the first pencil
has one 'infinite' elementary divisor $\mu^2$; the second has two: $\mu$, $\mu$). Therefore
these pencils turn out to be not strictly equivalent. 
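The infinite elementary divisors quoted for example (3) can be recovered computationally. The following sketch is an illustration under stated assumptions, not the author's method: it computes $|\mu A + B|$ (the homogeneous determinant at $\lambda = 1$) as a polynomial in $\mu$, reads off the power of $\mu$ dividing it (the total degree of the infinite elementary divisors), and takes the number of infinite elementary divisors to be $n - \operatorname{rank} B$, a standard fact that agrees with the canonical form (6) of Theorem 3. All function names are hypothetical.

```python
from fractions import Fraction

# Matrices of example (3), as read from the text.
A = [[2, 1, 3], [3, 2, 5], [3, 2, 6]]
B = [[1, 1, 2], [1, 1, 2], [1, 1, 3]]
A1 = [[2, 1, 1], [1, 2, 1], [1, 1, 1]]
B1 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]


def padd(p, q):
    """Sum of two polynomials stored as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def pneg(p):
    return [-x for x in p]


def det3_poly(M):
    """Determinant of a 3x3 matrix whose entries are polynomials."""
    (a, b, c), (d, e, f), (g, h, i) = M

    def minor(p, q, r, s):  # p*s - q*r
        return padd(pmul(p, s), pneg(pmul(q, r)))

    return padd(padd(pmul(a, minor(e, f, h, i)), pneg(pmul(b, minor(d, f, g, i)))),
                pmul(c, minor(d, e, g, h)))


def mu_multiplicity(A, B):
    """Power of mu dividing |mu*A + B|, i.e. |mu*A + lam*B| at lam = 1."""
    entries = [[[b, a] for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]  # b + a*mu
    coeffs = det3_poly(entries)
    return next(k for k, c in enumerate(coeffs) if c != 0)


def rank(M):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][c] != 0:
                factor = rows[r][c] / rows[rk][c]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


# (total degree of infinite divisors, assumed number of infinite divisors)
summary = {
    "first": (mu_multiplicity(A, B), 3 - rank(B)),
    "second": (mu_multiplicity(A1, B1), 3 - rank(B1)),
}
```

For the first pencil a total degree of 2 in a single divisor gives $\mu^2$; for the second, a total degree of 2 split over two divisors gives $\mu$, $\mu$, in agreement with the text.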
3. Suppose now that $A + \lambda B$ is an arbitrary regular pencil. Then there
exists a number $c$ such that $|A + cB| \neq 0$. We represent the given pencil
in the form $A_1 + (\lambda - c)B$, where $A_1 = A + cB$, so that $|A_1| \neq 0$. We
multiply the pencil on the left by $A_1^{-1}$: $E + (\lambda - c)A_1^{-1}B$. By a similarity
transformation we put the pencil in the form⁵
\[ E + (\lambda - c)\{J_0, J_1\} = \{E - cJ_0 + \lambda J_0,\ E - cJ_1 + \lambda J_1\}, \tag{4} \]
where $\{J_0, J_1\}$ is the quasi-diagonal normal form of $A_1^{-1}B$, $J_0$ is a nilpotent
Jordan matrix,⁶ and $|J_1| \neq 0$.

We multiply the first diagonal block on the right-hand side of (4) by
$(E - cJ_0)^{-1}$ and obtain: $E + \lambda(E - cJ_0)^{-1}J_0$. Here the coefficient of $\lambda$
is a nilpotent matrix.⁷ Therefore by a similarity transformation we can
put this pencil into the form⁸
\[ E + \lambda \hat J_0 = \{N^{(u_1)}, N^{(u_2)}, \ldots, N^{(u_s)}\} \qquad (N^{(u)} = E^{(u)} + \lambda H^{(u)}). \tag{5} \]

We multiply the second diagonal block on the right-hand side of (4) by
$J_1^{-1}$; it can then be put into the form $J + \lambda E$ by a similarity transformation,
where $J$ is a matrix of normal form⁹ and $E$ the unit matrix. We have thus
arrived at the following theorem:

THEOREM 3: Every regular pencil $A + \lambda B$ can be reduced to a (strictly
equivalent) canonical quasi-diagonal form
\[ \{N^{(u_1)}, N^{(u_2)}, \ldots, N^{(u_s)}, J + \lambda E\} \qquad (N^{(u)} = E^{(u)} + \lambda H^{(u)}), \tag{6} \]
where the first $s$ diagonal blocks correspond to infinite elementary divisors
$\mu^{u_1}, \mu^{u_2}, \ldots, \mu^{u_s}$ of the pencil $A + \lambda B$ and where the normal form of the last
diagonal block $J + \lambda E$ is uniquely determined by the finite elementary
divisors of the given pencil.


5 The unit matrices $E$ in the diagonal blocks on the right-hand side of (4) have the
same order as $J_0$ and $J_1$.

6 I.e., $J_0^l = O$ for some integer $l > 0$.

7 From $J_0^l = O$ it follows that $[(E - cJ_0)^{-1} J_0]^l = O$.

8 Here $E^{(u)}$ is a unit matrix of order $u$ and $H^{(u)}$ is a matrix of order $u$ whose elements
in the first superdiagonal are 1, while the remaining elements are zero.

9 Since the matrix $J$ can be replaced here by an arbitrary similar matrix, we may
assume that $J$ has one of the normal forms (for example, the natural form of the first
or second kind or the Jordan form (see Vol. I, Chapter VI, § 6)).
§ 3. Singular Pencils. The Reduction Theorem


1. We now proceed to consider a singular pencil of matrices $A + \lambda B$ of
dimension $m \times n$. We denote by $r$ the rank of the pencil, i.e., the largest
of the orders of minors that do not vanish identically. From the singularity
of the pencil it follows that at least one of the inequalities $r < n$ and
$r < m$ holds, say $r < n$. Then the columns of the $\lambda$-matrix $A + \lambda B$ are
linearly dependent, i.e., the equation
\[ (A + \lambda B)x = o, \tag{7} \]
where $x$ is an unknown column matrix, has a non-zero solution. Every
non-zero solution of this equation determines some dependence among the
columns of $A + \lambda B$. We restrict ourselves to only such solutions $x(\lambda)$ of (7)
as are polynomials in $\lambda$,¹⁰ and among these solutions we choose one of least
possible degree $\varepsilon$:
\[ x(\lambda) = x_0 - \lambda x_1 + \lambda^2 x_2 - \cdots + (-1)^{\varepsilon} \lambda^{\varepsilon} x_{\varepsilon} \qquad (x_{\varepsilon} \neq o). \tag{8} \]

Substituting this solution in (7) and equating to zero the coefficients of
the powers of $\lambda$, we obtain:
\[ Ax_0 = o, \quad Bx_0 - Ax_1 = o, \quad Bx_1 - Ax_2 = o, \quad \ldots, \quad Bx_{\varepsilon-1} - Ax_{\varepsilon} = o, \quad Bx_{\varepsilon} = o. \tag{9} \]

10 For the actual determination of the elements of the column $x$ satisfying (7) it is
convenient to solve a system of linear homogeneous equations in which the coefficients of
the unknown depend linearly on $\lambda$. The fundamental linearly independent solutions $x$ can
always be chosen such that their elements are polynomials in $\lambda$.

Considering this as a system of linear homogeneous equations for the
elements of the columns $x_0, -x_1, +x_2, \ldots, (-1)^{\varepsilon} x_{\varepsilon}$, we deduce that the
coefficient matrix of the system
\[
M_{\varepsilon} = M_{\varepsilon}[A + \lambda B] = \underbrace{\begin{pmatrix}
A & O & \cdots & O \\
B & A & & \vdots \\
O & B & \ddots & \\
\vdots & & \ddots & A \\
O & O & \cdots & B
\end{pmatrix}}_{\varepsilon + 1} \tag{10}
\]
is of rank $\rho_{\varepsilon} < (\varepsilon + 1)n$. At the same time, by the minimal property of $\varepsilon$,
the ranks $\rho_0, \rho_1, \ldots, \rho_{\varepsilon-1}$ of the matrices
\[
M_0 = \begin{pmatrix} A \\ B \end{pmatrix}, \quad
M_1 = \begin{pmatrix} A & O \\ B & A \\ O & B \end{pmatrix}, \quad \ldots, \quad
M_{\varepsilon-1} = \underbrace{\begin{pmatrix}
A & O & \cdots & O \\
B & A & & \vdots \\
O & B & \ddots & \\
\vdots & & \ddots & A \\
O & O & \cdots & B
\end{pmatrix}}_{\varepsilon} \tag{10'}
\]
satisfy the equations $\rho_0 = n$, $\rho_1 = 2n$, ..., $\rho_{\varepsilon-1} = \varepsilon n$.

Thus: The number $\varepsilon$ is the least value of the index $k$ for which the sign
$<$ holds in the relation $\rho_k \leq (k + 1)n$.
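The rank criterion just stated translates directly into a procedure for finding $\varepsilon$. The sketch below is an illustration only: the function names (`build_M`, `minimal_degree`), the search bound `max_k`, and the sample pencil are assumptions made for the demonstration. It assembles the matrices $M_k$ of (10) and (10') and returns the least $k$ with $\rho_k < (k + 1)n$.

```python
from fractions import Fraction


def rank(M):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][c] != 0:
                factor = rows[r][c] / rows[rk][c]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def build_M(A, B, k):
    """The matrix M_k[A + lam*B]: k+2 block rows, k+1 block columns,
    with A on the block diagonal and B directly below it."""
    m, n = len(A), len(A[0])
    M = [[0] * ((k + 1) * n) for _ in range((k + 2) * m)]
    for j in range(k + 1):
        for r in range(m):
            for c in range(n):
                M[j * m + r][j * n + c] = A[r][c]
                M[(j + 1) * m + r][j * n + c] = B[r][c]
    return M


def minimal_degree(A, B, max_k=None):
    """Least k with rank(M_k) < (k+1)*n, or None if none is found up to max_k."""
    n = len(A[0])
    if max_k is None:
        max_k = n  # search bound chosen for this sketch
    for k in range(max_k + 1):
        if rank(build_M(A, B, k)) < (k + 1) * n:
            return k
    return None


# Sample pencil (an assumption): A + lam*B = [[lam, 1, 0], [0, lam, 1]].
A = [[0, 1, 0], [0, 0, 1]]
B = [[1, 0, 0], [0, 1, 0]]
ranks = [rank(build_M(A, B, k)) for k in range(3)]
eps = minimal_degree(A, B)

# A regular pencil has no column dependence, so no such k exists.
no_dependence = minimal_degree([[1, 0], [0, 1]], [[0, 0], [0, 0]])
```

For the sample pencil the ranks are $3 = n$ and $6 = 2n$ for $k = 0, 1$ and drop below $3n = 9$ at $k = 2$, so $\varepsilon = 2$; indeed $x(\lambda) = (1, -\lambda, \lambda^2)$ is a solution of least degree.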


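Now we can formulate and prove the following fundamental theorem:

2. THEOREM 4: If the equation (7) has a solution of minimal degree $\varepsilon$ and
$\varepsilon > 0$, then the given pencil $A + \lambda B$ is strictly equivalent to a pencil of
the form
\[ \begin{pmatrix} L_{\varepsilon} & O \\ O & \hat A + \lambda \hat B \end{pmatrix}, \tag{11} \]
where
\[
L_{\varepsilon} = \underbrace{\begin{pmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & 0 & \cdots & \lambda & 1
\end{pmatrix}}_{\varepsilon + 1} \Biggr\}\, \varepsilon, \tag{12}
\]
and $\hat A + \lambda \hat B$ is a pencil of matrices for which the equation analogous to (7)
has no solution of degree less than $\varepsilon$.

Proof. We shall conduct the proof of the theorem in three stages. First,
we shall show that the given pencil $A + \lambda B$ is strictly equivalent to a pencil
of the form
\[ \begin{pmatrix} L_{\varepsilon} & D + \lambda F \\ O & \hat A + \lambda \hat B \end{pmatrix}, \tag{13} \]
where $D$, $F$, $\hat A$, $\hat B$ are constant rectangular matrices of the appropriate
dimensions. Then we shall establish that the equation $(\hat A + \lambda \hat B)\hat x = o$ has
no solution $\hat x(\lambda)$ of degree less than $\varepsilon$. Finally, we shall prove that by
further transformations the pencil (13) can be brought into the
quasi-diagonal form (11).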
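1. The first part of the proof will be couched in geometrical terms.
Instead of the pencil of matrices $A + \lambda B$ we consider a pencil of operators
$A + \lambda B$ mapping $R_n$ into $R_m$ and show that with a suitable choice of bases
in the spaces the matrix corresponding to the operator $A + \lambda B$ assumes the
form (13).

Instead of (7) we take the vector equation
\[ (A + \lambda B)x = o \tag{14} \]
with the vector solution
\[ x(\lambda) = x_0 - \lambda x_1 + \lambda^2 x_2 - \cdots + (-1)^{\varepsilon} \lambda^{\varepsilon} x_{\varepsilon}; \tag{15} \]
the equations (9) are replaced by the vector equations
\[ Ax_0 = o, \quad Ax_1 = Bx_0, \quad Ax_2 = Bx_1, \quad \ldots, \quad Ax_{\varepsilon} = Bx_{\varepsilon-1}, \quad Bx_{\varepsilon} = o. \tag{16} \]

Below we shall show that the vectors
\[ Ax_1,\ Ax_2,\ \ldots,\ Ax_{\varepsilon} \tag{17} \]
are linearly independent. Hence it will be easy to deduce the linear
independence of the vectors
\[ x_0,\ x_1,\ \ldots,\ x_{\varepsilon}. \tag{18} \]

For since $Ax_0 = o$ we have from $\alpha_0 x_0 + \alpha_1 x_1 + \cdots + \alpha_{\varepsilon} x_{\varepsilon} = o$ that
$\alpha_1 Ax_1 + \cdots + \alpha_{\varepsilon} Ax_{\varepsilon} = o$, so that by the linear independence of the vectors
(17) $\alpha_1 = \alpha_2 = \cdots = \alpha_{\varepsilon} = 0$. But $x_0 \neq o$, since otherwise $\frac{1}{\lambda}\, x(\lambda)$ would be
a solution of (14) of degree $\varepsilon - 1$, which is impossible. Therefore $\alpha_0 = 0$
also.

Now if we take the vectors (17) and (18) as the first $\varepsilon$ and $\varepsilon + 1$ vectors for
new bases in $R_m$ and $R_n$, respectively, then in these new bases the operators
$A$ and $B$, by (16), will correspond to the matrices
\[
\tilde A = \begin{pmatrix}
0 & 1 & \cdots & 0 & * & \cdots & * \\
0 & 0 & \ddots & \vdots & * & \cdots & * \\
\vdots & \vdots & & 1 & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & * & \cdots & * \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & * & \cdots & *
\end{pmatrix}, \qquad
\tilde B = \begin{pmatrix}
1 & \cdots & 0 & 0 & * & \cdots & * \\
\vdots & \ddots & \vdots & \vdots & * & \cdots & * \\
0 & \cdots & 1 & 0 & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & * & \cdots & * \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & * & \cdots & *
\end{pmatrix},
\]
where in each matrix the first $\varepsilon + 1$ columns are displayed explicitly and the non-zero entries among them lie in the first $\varepsilon$ rows;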
hence the $\lambda$-matrix $\tilde A + \lambda \tilde B$ is of the form (13). All the preceding
arguments will be justified if we can show that the vectors (17) are linearly
independent. Assume the contrary and let $Ax_h$ ($h \geq 1$) be the first vector
in (17) that is linearly dependent on the preceding ones:
\[ Ax_h = \alpha_1 Ax_{h-1} + \alpha_2 Ax_{h-2} + \cdots + \alpha_{h-1} Ax_1. \]

By (16) this equation can be rewritten as follows:
\[ Bx_{h-1} = \alpha_1 Bx_{h-2} + \alpha_2 Bx_{h-3} + \cdots + \alpha_{h-1} Bx_0, \]
i.e.,
\[ Bx^*_{h-1} = o, \]
where
\[ x^*_{h-1} = x_{h-1} - \alpha_1 x_{h-2} - \alpha_2 x_{h-3} - \cdots - \alpha_{h-1} x_0. \]
Furthermore, again by (16),
\[ Ax^*_{h-1} = B(x_{h-2} - \alpha_1 x_{h-3} - \cdots - \alpha_{h-2} x_0) = Bx^*_{h-2}, \]
where
\[ x^*_{h-2} = x_{h-2} - \alpha_1 x_{h-3} - \cdots - \alpha_{h-2} x_0. \]
Continuing the process and introducing the vectors
\[ x^*_{h-3} = x_{h-3} - \alpha_1 x_{h-4} - \cdots - \alpha_{h-3} x_0, \quad \ldots, \quad x^*_1 = x_1 - \alpha_1 x_0, \quad x^*_0 = x_0, \]
we obtain a chain of equations
\[ Bx^*_{h-1} = o, \quad Ax^*_{h-1} = Bx^*_{h-2}, \quad \ldots, \quad Ax^*_1 = Bx^*_0, \quad Ax^*_0 = o. \tag{19} \]

From (19) it follows that
\[ x^*(\lambda) = x^*_0 - \lambda x^*_1 + \cdots + (-1)^{h-1} \lambda^{h-1} x^*_{h-1} \qquad (x^*_0 = x_0 \neq o) \]
is a non-zero solution of (14) of degree $\leq h - 1 < \varepsilon$, which is impossible.
Thus, the vectors (17) are linearly independent.


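2. We shall now show that the equation $(\hat A + \lambda \hat B)\hat x = o$ has no solution
of degree less than $\varepsilon$. To begin with, we observe that the equation $L_{\varepsilon} y = o$,
like (7), has a non-zero solution of least degree $\varepsilon$. We can see this
immediately, if we replace the matrix equation $L_{\varepsilon} y = o$ by the system of ordinary
equations
\[ \lambda y_1 + y_2 = 0, \quad \lambda y_2 + y_3 = 0, \quad \ldots, \quad \lambda y_{\varepsilon} + y_{\varepsilon+1} = 0 \qquad (y = (y_1, y_2, \ldots, y_{\varepsilon+1})); \]
\[ y_k = (-1)^{k-1} y_1 \lambda^{k-1} \qquad (k = 1, 2, \ldots, \varepsilon + 1). \]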
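The closed form for $y_k$ can be checked by substituting it back into $L_{\varepsilon} y = o$ at a few values of $\lambda$. The sketch below is an illustration only; the function names and the chosen value `eps = 3` are assumptions made for the demonstration.

```python
from fractions import Fraction


def L_block(eps, lam):
    """The eps x (eps+1) matrix L_eps of (12), evaluated at the number lam."""
    return [[lam if j == i else (1 if j == i + 1 else 0) for j in range(eps + 1)]
            for i in range(eps)]


def y_solution(eps, lam, y1=1):
    """Components y_k = (-1)^(k-1) * y1 * lam^(k-1), k = 1, ..., eps+1."""
    return [(-1) ** (k - 1) * y1 * lam ** (k - 1) for k in range(1, eps + 2)]


def apply(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


eps = 3
sample_points = (Fraction(-2), Fraction(1, 3), Fraction(5))
# Each residual is the vector L_eps(lam) * y(lam); all of them should vanish.
residuals = [apply(L_block(eps, lam), y_solution(eps, lam)) for lam in sample_points]
```

The last component $y_{\varepsilon+1} = (-1)^{\varepsilon} y_1 \lambda^{\varepsilon}$ shows that a constant non-zero $y_1$ gives a solution of degree exactly $\varepsilon$.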
On the other hand, if the pencil has the 'triangular' form (13) then the
corresponding matrices $M_k$ ($k = 0, 1, \ldots, \varepsilon$) (see (10) and (10') on
pp. 29 and 30) can also be brought into triangular form, after a suitable
permutation of rows and columns:
\[ \begin{pmatrix} M_k[L_{\varepsilon}] & M_k[D + \lambda F] \\ O & M_k[\hat A + \lambda \hat B] \end{pmatrix}. \tag{20} \]

For $k = \varepsilon - 1$ all the columns of this matrix, like those of $M_{\varepsilon-1}[L_{\varepsilon}]$, are
linearly independent.¹¹ But $M_{\varepsilon-1}[L_{\varepsilon}]$ is a square matrix of order $\varepsilon(\varepsilon + 1)$.
Therefore in $M_{\varepsilon-1}[\hat A + \lambda \hat B]$ also, all the columns are linearly independent
and, as we have explained at the beginning of the section, this means that the
equation $(\hat A + \lambda \hat B)\hat x = o$ has no solution of degree less than or equal to $\varepsilon - 1$,
which is what we had to prove.


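11 This follows from the fact that the rank of the matrix (20) for $k = \varepsilon - 1$ is equal
to $\varepsilon n$; a similar equation holds for the rank of the matrix $M_{\varepsilon-1}[L_{\varepsilon}]$.


3. Let us replace the pencil (13) by the strictly equivalent pencil
\[
\begin{pmatrix} E_1 & Y \\ O & E_2 \end{pmatrix}
\begin{pmatrix} L_{\varepsilon} & D + \lambda F \\ O & \hat A + \lambda \hat B \end{pmatrix}
\begin{pmatrix} E_3 & -X \\ O & E_4 \end{pmatrix}
= \begin{pmatrix} L_{\varepsilon} & D + \lambda F + Y(\hat A + \lambda \hat B) - L_{\varepsilon} X \\ O & \hat A + \lambda \hat B \end{pmatrix}, \tag{21}
\]
where $E_1$, $E_2$, $E_3$, and $E_4$ are square unit matrices of orders $\varepsilon$, $m - \varepsilon$, $\varepsilon + 1$,
and $n - \varepsilon - 1$, respectively, and $X$, $Y$ are arbitrary constant rectangular
matrices of the appropriate dimensions. Our theorem will be completely
proved if we can show that the matrices $X$ and $Y$ can be chosen such that the
matrix equation
\[ L_{\varepsilon} X = D + \lambda F + Y(\hat A + \lambda \hat B) \tag{22} \]
holds.

We introduce a notation for the elements of $D$, $F$, $X$ and also for the
rows of $Y$ and the columns of $\hat A$ and $\hat B$:
\[
D = \| d_{ik} \|, \quad F = \| f_{ik} \|, \quad X = \| x_{jk} \| \qquad (i = 1, 2, \ldots, \varepsilon;\ k = 1, 2, \ldots, n - \varepsilon - 1;\ j = 1, 2, \ldots, \varepsilon + 1),
\]
\[
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{\varepsilon} \end{pmatrix}, \quad
\hat A = (a_1, a_2, \ldots, a_{n-\varepsilon-1}), \quad \hat B = (b_1, b_2, \ldots, b_{n-\varepsilon-1}).
\]
Then the matrix equation (22) can be replaced by a system of scalar
equations that expresses the equality of the elements of the $k$-th column on the
right-hand and left-hand sides of (22) ($k = 1, 2, \ldots, n - \varepsilon - 1$):
\[
\begin{aligned}
x_{2k} + \lambda x_{1k} &= d_{1k} + \lambda f_{1k} + y_1 a_k + \lambda y_1 b_k, \\
x_{3k} + \lambda x_{2k} &= d_{2k} + \lambda f_{2k} + y_2 a_k + \lambda y_2 b_k, \\
x_{4k} + \lambda x_{3k} &= d_{3k} + \lambda f_{3k} + y_3 a_k + \lambda y_3 b_k, \\
&\ \,\vdots \\
x_{\varepsilon+1,k} + \lambda x_{\varepsilon k} &= d_{\varepsilon k} + \lambda f_{\varepsilon k} + y_{\varepsilon} a_k + \lambda y_{\varepsilon} b_k
\end{aligned}
\qquad (k = 1, 2, \ldots, n - \varepsilon - 1). \tag{23}
\]

The left-hand sides of these equations are linear binomials in $\lambda$. The
free term of each of the first $\varepsilon - 1$ of these binomials is equal to the
coefficient of $\lambda$ in the next binomial. But then the right-hand sides must also
satisfy this condition. Therefore
\[
\begin{aligned}
y_1 a_k - y_2 b_k &= f_{2k} - d_{1k}, \\
y_2 a_k - y_3 b_k &= f_{3k} - d_{2k}, \\
&\ \,\vdots \\
y_{\varepsilon-1} a_k - y_{\varepsilon} b_k &= f_{\varepsilon k} - d_{\varepsilon-1,k}
\end{aligned}
\qquad (k = 1, 2, \ldots, n - \varepsilon - 1). \tag{24}
\]

If (24) holds, then the required elements of $X$ can obviously be determined
from (23).

It now remains to show that the system of equations (24) for the
elements of $Y$ always has a solution for arbitrary $d_{ik}$ and $f_{ik}$ ($i = 1, 2, \ldots, \varepsilon$;
$k = 1, 2, \ldots, n - \varepsilon - 1$). Indeed, the matrix formed from the coefficients
of the unknown elements of the rows $y_1, -y_2, y_3, -y_4, \ldots$, can be written,
after transposition, in the form
\[
\underbrace{\begin{pmatrix}
\hat A & O & \cdots & O \\
\hat B & \hat A & & \vdots \\
O & \hat B & \ddots & \\
\vdots & & \ddots & \hat A \\
O & O & \cdots & \hat B
\end{pmatrix}}_{\varepsilon - 1}.
\]

But this is the matrix $M_{\varepsilon-2}$ for the pencil of rectangular matrices $\hat A + \lambda \hat B$
(see (10') on p. 30). The rank of the matrix is $(\varepsilon - 1)(n - \varepsilon - 1)$,
because the equation $(\hat A + \lambda \hat B)\hat x = o$, by what we have shown, has no solutions
of degree less than $\varepsilon$. Thus, the rank of the system of equations (24) is
equal to the number of equations and such a system is consistent
(non-contradictory) for arbitrary free terms.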


This completes the proof of the theorem. 
§ 4. The Canonical Form of a Singular Pencil of Matrices


1. Let $A + \lambda B$ be an arbitrary singular pencil of matrices of dimension
$m \times n$. To begin with, we shall assume that neither among the columns
nor among the rows of the pencil is there a linear dependence with constant
coefficients.

Let $r < n$, where $r$ is the rank of the pencil, so that the columns of $A + \lambda B$
are linearly dependent. In this case the equation $(A + \lambda B)x = o$ has a
non-zero solution of minimal degree $\varepsilon_1$. From the restriction made at the
beginning of this section it follows that $\varepsilon_1 > 0$. Therefore by Theorem 4 the
given pencil can be transformed into the form
\[ \begin{pmatrix} L_{\varepsilon_1} & O \\ O & A_1 + \lambda B_1 \end{pmatrix}, \]
where the equation $(A_1 + \lambda B_1)x^{(1)} = o$ has no solution $x^{(1)}$ of degree less
than $\varepsilon_1$.

If this equation has a non-zero solution of minimal degree $\varepsilon_2$ (where,
necessarily, $\varepsilon_2 \geq \varepsilon_1$), then by applying Theorem 4 to the pencil $A_1 + \lambda B_1$
we can transform the given pencil into the form
\[ \begin{pmatrix} L_{\varepsilon_1} & O & O \\ O & L_{\varepsilon_2} & O \\ O & O & A_2 + \lambda B_2 \end{pmatrix}. \]

Continuing this process, we can put the given pencil into the
quasi-diagonal form
\[
\begin{pmatrix}
L_{\varepsilon_1} & & & & O \\
 & L_{\varepsilon_2} & & & \\
 & & \ddots & & \\
 & & & L_{\varepsilon_p} & \\
O & & & & A_p + \lambda B_p
\end{pmatrix}, \tag{25}
\]
where $0 < \varepsilon_1 \leq \varepsilon_2 \leq \cdots \leq \varepsilon_p$ and the equation $(A_p + \lambda B_p)x^{(p)} = o$ has no
non-zero solution, so that the columns of $A_p + \lambda B_p$ are linearly independent.¹²

If the rows of $A_p + \lambda B_p$ are linearly dependent, then the transposed
pencil $A_p^T + \lambda B_p^T$ can be put into the form (25), where instead of $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_p$
there occur the numbers $(0 <)\ \eta_1 \leq \eta_2 \leq \cdots \leq \eta_q$.¹³ But then the given
pencil $A + \lambda B$ turns out to be transformable into the quasi-diagonal form
\[
\begin{pmatrix}
L_{\varepsilon_1} & & & & & & O \\
 & \ddots & & & & & \\
 & & L_{\varepsilon_p} & & & & \\
 & & & L_{\eta_1}^T & & & \\
 & & & & \ddots & & \\
 & & & & & L_{\eta_q}^T & \\
O & & & & & & A_0 + \lambda B_0
\end{pmatrix} \tag{26}
\]
\[ (0 < \varepsilon_1 \leq \varepsilon_2 \leq \cdots \leq \varepsilon_p, \quad 0 < \eta_1 \leq \eta_2 \leq \cdots \leq \eta_q), \]
where both the columns and the rows of $A_0 + \lambda B_0$ are linearly independent,
i.e., $A_0 + \lambda B_0$ is a regular pencil.¹⁴


12 In the special case where $\varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_p = m$ the block $A_p + \lambda B_p$ is absent.

13 Since no linear dependence with constant coefficients exists among the rows of the
pencil $A + \lambda B$ and consequently of $A_p + \lambda B_p$, we have $\eta_1 > 0$.

14 If in the given pencil $r = n$, i.e., if the columns of the pencil are linearly independent,
then the first $p$ diagonal blocks in (26) of the form $L_{\varepsilon}$ are absent ($p = 0$). In the same
way, if $r = m$, i.e., if the rows of $A + \lambda B$ are linearly independent, then in (26) the
diagonal blocks of the form $L_{\eta}^T$ are absent ($q = 0$).


2. We now consider the general case where the rows and the columns of
the given pencil may be connected by linear relations with constant
coefficients. We denote the maximal number of constant independent solutions
of the equations
\[ (A + \lambda B)x = o \quad \text{and} \quad (A^T + \lambda B^T)y = o \]
by $g$ and $h$, respectively. Instead of the first of these equations we consider,
just as in the proof of Theorem 4, the corresponding vector equation
$(A + \lambda B)x = o$ ($A$ and $B$ are operators mapping $R_n$ into $R_m$). We denote
linearly independent constant solutions of this equation by $e_1, e_2, \ldots, e_g$
and take them as the first $g$ basis vectors in $R_n$. Then the first $g$ columns
of the corresponding matrix $\tilde A + \lambda \tilde B$ consist of zeros:
\[ \tilde A + \lambda \tilde B = (\overbrace{O}^{g},\ \tilde A_1 + \lambda \tilde B_1). \tag{27} \]

Similarly, the first $h$ rows of the pencil $\tilde A_1 + \lambda \tilde B_1$ can be made into zeros.
The given pencil then assumes the form
\[ \begin{pmatrix} \overbrace{O}^{g} & O \\ O & A^0 + \lambda B^0 \end{pmatrix}, \tag{28} \]
in which the first $h$ rows and the first $g$ columns consist of zeros,
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where there is no longer any linear dependence with constant coefficients 
among the rows or the columns of the pencil A° + 4B°. The pencil A°® + AB° 
can now be represented in the form (26). Thus, in the general case, the 
pencil A + AB can always be put into the canonical quasi-diagonal form 


g 
([0, Legsry oe+s Lepr Lrnyys +++» Lng, Ag + ABo}- (29) 


The choice of indices for ¢ and 7 is due to the fact that it is convenient here 
to take ¢,= & ="**= e,=0 and 4 = 4, =":' =m, =D. 

When we replace the regular pencil $A_0 + \lambda B_0$ in (29) by its canonical
form (6) (see § 2, p. 28), we finally obtain the following quasi-diagonal
matrix:

$$\{O_{h\times g};\ L_{\varepsilon_{g+1}}, \dots, L_{\varepsilon_p};\ L^T_{\eta_{h+1}}, \dots, L^T_{\eta_q};\ N^{(u_1)}, \dots, N^{(u_s)};\ J + \lambda E\}, \tag{30}$$

where the matrix J is of Jordan normal form or of natural normal form and
$N^{(u)} = E^{(u)} + \lambda H^{(u)}$.

The matrix (30) is the canonical form of the pencil $A + \lambda B$ in the most
general case.

In order to determine the canonical form (30) of a given pencil immediately,
without carrying out the successive reduction processes, we shall,
following Kronecker, introduce in the next section the concept of minimal
indices of a pencil.


§ 5. The Minimal Indices of a Pencil. Criterion for
Strong Equivalence of Pencils


1. Let $A + \lambda B$ be an arbitrary singular pencil of rectangular matrices.
Then the k polynomial columns $x_1(\lambda), x_2(\lambda), \dots, x_k(\lambda)$ that are solutions
of the equation

$$(A + \lambda B)x = o \tag{31}$$

are linearly dependent if the rank of the polynomial matrix formed from
these columns, $X = [x_1(\lambda), x_2(\lambda), \dots, x_k(\lambda)]$, is less than k. In that case
there exist k polynomials $p_1(\lambda), p_2(\lambda), \dots, p_k(\lambda)$, not all identically zero,
such that

$$p_1(\lambda)x_1(\lambda) + p_2(\lambda)x_2(\lambda) + \dots + p_k(\lambda)x_k(\lambda) \equiv o.$$

But if the rank of X is k, then such a dependence does not exist and the
solutions $x_1(\lambda), x_2(\lambda), \dots, x_k(\lambda)$ are linearly independent.

Among all the solutions of (31) we choose a non-zero solution $x_1(\lambda)$ of
least degree $\varepsilon_1$. Among all the solutions of the same equation that are
linearly independent of $x_1(\lambda)$ we take a solution $x_2(\lambda)$ of least degree $\varepsilon_2$.
Obviously, $\varepsilon_1 \le \varepsilon_2$. We continue the process, choosing among the solutions
that are linearly independent of $x_1(\lambda)$ and $x_2(\lambda)$ a solution $x_3(\lambda)$ of minimal
degree $\varepsilon_3$, etc. Since the number of linearly independent solutions of (31)
is always at most n, the process must come to an end. We obtain a
fundamental series of solutions of (31)

$$x_1(\lambda),\ x_2(\lambda),\ \dots,\ x_p(\lambda) \tag{32}$$

having the degrees

$$\varepsilon_1 \le \varepsilon_2 \le \dots \le \varepsilon_p. \tag{33}$$


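In general, a fundamental series of solutions is not uniquely determined
(to within scalar factors) by the pencil $A + \lambda B$. However, two distinct
fundamental series of solutions always have one and the same series of
degrees $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_p$. For let us consider in addition to (32) another
fundamental series of solutions $\tilde x_1(\lambda), \tilde x_2(\lambda), \dots$ with the degrees $\tilde\varepsilon_1, \tilde\varepsilon_2, \dots$.
Suppose that in (33)

$$\varepsilon_1 = \dots = \varepsilon_{n_1} < \varepsilon_{n_1+1} = \dots = \varepsilon_{n_2} < \dots,$$

and similarly, in the series $\tilde\varepsilon_1, \tilde\varepsilon_2, \dots$,

$$\tilde\varepsilon_1 = \dots = \tilde\varepsilon_{\tilde n_1} < \tilde\varepsilon_{\tilde n_1+1} = \dots = \tilde\varepsilon_{\tilde n_2} < \dots.$$

Obviously, $\varepsilon_1 = \tilde\varepsilon_1$. Every column $\tilde x_i(\lambda)$ $(i = 1, 2, \dots, \tilde n_1)$ is a linear
combination of the columns $x_1(\lambda), x_2(\lambda), \dots, x_{n_1}(\lambda)$, since otherwise the
solution $x_{n_1+1}(\lambda)$ in (32) could be replaced by $\tilde x_i(\lambda)$, which is of smaller degree.
It is obvious that, conversely, every column $x_i(\lambda)$ $(i = 1, 2, \dots, n_1)$ is a
linear combination of the columns $\tilde x_1(\lambda), \tilde x_2(\lambda), \dots, \tilde x_{\tilde n_1}(\lambda)$. Therefore
$n_1 = \tilde n_1$ and $\varepsilon_{n_1+1} = \tilde\varepsilon_{\tilde n_1+1}$. Now by a similar argument we obtain that
$n_2 = \tilde n_2$ and $\varepsilon_{n_2+1} = \tilde\varepsilon_{\tilde n_2+1}$, etc.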


2. Every solution $x_k(\lambda)$ of the fundamental series (32) yields a linear
dependence of degree $\varepsilon_k$ among the columns of $A + \lambda B$ $(k = 1, 2, \dots, p)$.
Therefore the numbers $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_p$ are called the minimal indices for the
columns of the pencil $A + \lambda B$.

The minimal indices $\eta_1, \eta_2, \dots, \eta_q$ for the rows of the pencil $A + \lambda B$ are
introduced similarly. Here the equation $(A + \lambda B)x = o$ is replaced by
$(A^T + \lambda B^T)y = o$, and $\eta_1, \eta_2, \dots, \eta_q$ are defined as minimal indices for the
columns of the transposed pencil $A^T + \lambda B^T$.

Strictly equivalent pencils have the same minimal indices. For let
$A + \lambda B$ and $P(A + \lambda B)Q$ be two such pencils (P and Q are non-singular
square matrices). Then the equation (31) for the first pencil can be written,
after multiplication on the left by P, as follows:

$$P(A + \lambda B)Q \cdot Q^{-1}x = o.$$

Hence it is clear that all the solutions of (31), after multiplication on the
left by $Q^{-1}$, give rise to a complete system of solutions of the equation

$$P(A + \lambda B)Qz = o.$$

Therefore the pencils $A + \lambda B$ and $P(A + \lambda B)Q$ have the same minimal
indices for the columns. That the minimal indices for the rows also coincide
can be established by going over to the transposed pencils.

Let us compute the minimal indices for the canonical quasi-diagonal
matrix

$$\{O_{h\times g};\ L_{\varepsilon_{g+1}}, \dots, L_{\varepsilon_p};\ L^T_{\eta_{h+1}}, \dots, L^T_{\eta_q};\ A_0 + \lambda B_0\} \tag{34}$$

($A_0 + \lambda B_0$ is a regular pencil having the normal form (6)).

We note first of all that: The complete system of indices for the columns
(rows) of a quasi-diagonal matrix is obtained as the union of the corresponding
systems of minimal indices of the individual diagonal blocks. The matrix
$L_\varepsilon$ has only one index $\varepsilon$ for the columns, and its rows are linearly
independent. Similarly, the matrix $L_\eta^T$ has only one index $\eta$ for the rows, and its
columns are linearly independent. Therefore the matrix (34) has as its
minimal indices for the columns

$$\varepsilon_1 = \dots = \varepsilon_g = 0,\ \varepsilon_{g+1}, \dots, \varepsilon_p$$

and for the rows

$$\eta_1 = \dots = \eta_h = 0,\ \eta_{h+1}, \dots, \eta_q.$$
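The statement that $L_\varepsilon$ has the column index $\varepsilon$ rests on the solution $x(\lambda) = (1, -\lambda, \lambda^2, \dots, (-\lambda)^\varepsilon)$ of $L_\varepsilon x = o$. The following is only a small sketch that checks this identity; the function name `L_times_x` and the dictionary representation of polynomials are our own assumptions, and the sketch does not prove that no solution of lower degree exists.

```python
def L_times_x(eps):
    """Return the rows of L_eps(lambda) * x(lambda) as polynomials in lambda,
    where x(lambda) = (1, -lambda, lambda^2, ..., (-lambda)^eps).
    A polynomial is stored as a dict {power: coefficient}."""
    x = [{i: (-1) ** i} for i in range(eps + 1)]
    rows = []
    for i in range(eps):
        # Row i of L_eps has lambda in column i and 1 in column i + 1,
        # so the product row is lambda * x_i + x_{i+1}.
        acc = {}
        for power, coeff in x[i].items():
            acc[power + 1] = acc.get(power + 1, 0) + coeff
        for power, coeff in x[i + 1].items():
            acc[power] = acc.get(power, 0) + coeff
        # Drop zero coefficients so that the zero polynomial is the empty dict.
        rows.append({p: c for p, c in acc.items() if c != 0})
    return rows


residual = L_times_x(3)  # every row should be the zero polynomial
```

Each entry of `residual` is an empty dictionary, i.e., the zero polynomial, so $x(\lambda)$ is a solution of degree $\varepsilon$.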


We note further that $L_\varepsilon$ has no elementary divisors, since among its
minors of maximal order $\varepsilon$ there is one equal to 1 and one equal to $\lambda^\varepsilon$. The
same statement is, of course, true for the transposed matrix $L_\varepsilon^T$. Since the
elementary divisors of a quasi-diagonal matrix are obtained by combining
those of the individual diagonal blocks (see Volume I, Chapter VI, p. 141),
the elementary divisors of the $\lambda$-matrix (34) coincide with those of its regular
'kernel' $A_0 + \lambda B_0$.

The canonical form of the pencil (34) is completely determined by the
minimal indices $\varepsilon_1, \dots, \varepsilon_p$, $\eta_1, \dots, \eta_q$ and the elementary divisors of the
pencil or, what is the same, of the strictly equivalent pencil $A + \lambda B$. Since
two pencils having one and the same canonical form are strictly equivalent,
we have proved the following theorem:


THEOREM 5 (Kronecker): Two arbitrary pencils $A + \lambda B$ and $A_1 + \lambda B_1$
of rectangular matrices of the same dimension $m \times n$ are strictly equivalent
if and only if they have the same minimal indices and the same (finite and
infinite) elementary divisors.


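In conclusion, we write down, for purposes of illustration, the canonical
form of a pencil $A + \lambda B$ with the minimal indices $\varepsilon_1 = 0$, $\varepsilon_2 = 1$, $\varepsilon_3 = 2$,
$\eta_1 = 0$, $\eta_2 = 0$, $\eta_3 = 2$ and the elementary divisors $\lambda^2$, $(\lambda + 2)^2$, $\mu^3$:¹⁵

$$\left\{ O_{2\times 1};\ \begin{pmatrix} \lambda & 1 \end{pmatrix};\ \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \end{pmatrix};\ \begin{pmatrix} \lambda & 0 \\ 1 & \lambda \\ 0 & 1 \end{pmatrix};\ \begin{pmatrix} 1 & \lambda & 0 \\ 0 & 1 & \lambda \\ 0 & 0 & 1 \end{pmatrix};\ \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix};\ \begin{pmatrix} \lambda + 2 & 1 \\ 0 & \lambda + 2 \end{pmatrix} \right\}. \tag{35}$$
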
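The size of such a canonical form can be read off from the invariants alone. The sketch below counts rows and columns block by block, using only the block sizes stated in the text ($L_\varepsilon$ is $\varepsilon \times (\varepsilon + 1)$, $L_\eta^T$ is $(\eta + 1) \times \eta$, a zero index contributes one zero column or one zero row, and an elementary divisor of degree d contributes a square block of order d). The function name `canonical_shape` and its argument names are assumptions made for this illustration.

```python
def canonical_shape(eps, eta, infinite, finite):
    """Return (rows, columns) of the canonical quasi-diagonal form.

    eps, eta : minimal indices for the columns and for the rows
    infinite : degrees of the infinite elementary divisors mu^u
    finite   : degrees of the finite elementary divisors (lambda + lambda_i)^c
    """
    g = sum(1 for e in eps if e == 0)  # number of zero columns
    h = sum(1 for n in eta if n == 0)  # number of zero rows
    rows = h + sum(e for e in eps if e > 0) + sum(n + 1 for n in eta if n > 0)
    cols = g + sum(e + 1 for e in eps if e > 0) + sum(n for n in eta if n > 0)
    regular = sum(infinite) + sum(finite)  # order of the regular kernel
    return rows + regular, cols + regular


# Invariants of the illustrative pencil in the text.
shape = canonical_shape([0, 1, 2], [0, 0, 2], infinite=[3], finite=[2, 2])
```

For the invariants of the example, `shape` is `(15, 15)`: the pencil is square but singular, since it has non-trivial minimal indices.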
§ 6. Singular Pencils of Quadratic Forms 
1. Suppose given two complex quadratic forms:

$$A(x, x) = \sum_{i,k=1}^{n} a_{ik}x_ix_k, \qquad B(x, x) = \sum_{i,k=1}^{n} b_{ik}x_ix_k; \tag{36}$$

they generate a pencil of quadratic forms $A(x, x) + \lambda B(x, x)$. This pencil
of forms corresponds to a pencil of symmetric matrices $A + \lambda B$ ($A^T = A$,
$B^T = B$). If we subject the variables in the pencil of forms $A(x, x) + \lambda B(x, x)$
to a non-singular linear transformation $x = Tz$ ($|T| \ne 0$), then the
transformed pencil of forms $\tilde A(z, z) + \lambda\tilde B(z, z)$ corresponds to the pencil of
matrices

$$\tilde A + \lambda\tilde B = T^T(A + \lambda B)T; \tag{37}$$

here T is a constant (i.e., independent of $\lambda$) non-singular square matrix of
order n.

15 All the elements of the matrix that are not mentioned expressly are zero.

Two pencils of matrices $A + \lambda B$ and $\tilde A + \lambda\tilde B$ that are connected by a
relation (37) are called congruent (see Definition 1 of Chapter X; Vol. I, p. 296).
Obviously, congruence is a special case of equivalence of pencils of
matrices. However, if congruence of two pencils of symmetric (or
skew-symmetric) matrices is under consideration, then the concept of congruence
coincides with that of equivalence. This is the content of the following
theorem.


THEOREM 6: Two strictly equivalent pencils of complex symmetric (or
skew-symmetric) matrices are always congruent.


Proof. Let $\Lambda = A + \lambda B$ and $\tilde\Lambda = \tilde A + \lambda\tilde B$ be two strictly equivalent
pencils of symmetric (skew-symmetric) matrices:

$$\tilde\Lambda = P\Lambda Q \qquad (\Lambda^T = \pm\Lambda,\ \tilde\Lambda^T = \pm\tilde\Lambda;\ |P| \ne 0,\ |Q| \ne 0). \tag{38}$$

By going over to the transposed matrices we obtain:

$$\tilde\Lambda = Q^T\Lambda P^T. \tag{39}$$

From (38) and (39), we have

$$\Lambda Q(P^T)^{-1} = P^{-1}Q^T\Lambda. \tag{40}$$

Setting

$$U = Q(P^T)^{-1}, \tag{41}$$

we rewrite (40) as follows:

$$\Lambda U = U^T\Lambda. \tag{42}$$

From (42) it follows easily that

$$\Lambda U^k = (U^T)^k\Lambda \qquad (k = 0, 1, 2, \dots)$$

and, in general,

$$\Lambda S = S^T\Lambda, \tag{43}$$

where

$$S = f(U), \tag{44}$$

and $f(\lambda)$ is an arbitrary polynomial in $\lambda$. Let us assume that this
polynomial is chosen such that $|S| \ne 0$. Then we have from (43):

$$\Lambda = S^T\Lambda S^{-1}. \tag{45}$$

Substituting this expression for $\Lambda$ in (38), we have:

$$\tilde\Lambda = PS^T\Lambda S^{-1}Q. \tag{46}$$

If this relation is to be a congruence transformation, the following
equation must be satisfied:

$$(PS^T)^T = S^{-1}Q,$$

which can be rewritten as

$$S^2 = Q(P^T)^{-1} = U.$$

Now the matrix $S = f(U)$ satisfies this equation if we take as $f(\lambda)$ the
interpolation polynomial for $\sqrt{\lambda}$ on the spectrum of U. This can be done,
because the many-valued function $\sqrt{\lambda}$ has a single-valued branch determined
on the spectrum of U, since $|U| \ne 0$.

The equation (46) now becomes the condition for congruence

$$\tilde\Lambda = T^T\Lambda T \qquad \left(T = S^{-1}Q,\ S = \sqrt{Q(P^T)^{-1}}\right). \tag{47}$$
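As a check on the last step, and only as a sketch that uses nothing beyond (38), (41), (43) and the equation $S^2 = U$ (writing $\Lambda$ and $\tilde\Lambda$ for the two pencils), one can verify directly that $T = S^{-1}Q$ turns the equivalence into a congruence:

```latex
\begin{aligned}
S^2 = U = Q(P^T)^{-1}
  &\;\Longrightarrow\; (S^T)^2 = U^T = P^{-1}Q^T
   \;\Longrightarrow\; Q^T = P\,(S^T)^2, \\
T^T = Q^T (S^T)^{-1} &= P\,(S^T)^2 (S^T)^{-1} = P\,S^T, \\
T^T \Lambda\, T &= P\,S^T \Lambda\, S^{-1} Q
   = P\,\Lambda S\, S^{-1} Q
   = P \Lambda Q = \tilde\Lambda .
\end{aligned}
```

The third line uses $S^T\Lambda = \Lambda S$, which is (43).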


From this theorem and Theorem 5 we deduce:


COROLLARY: Two pencils of quadratic forms

$$A(x, x) + \lambda B(x, x) \quad\text{and}\quad \tilde A(z, z) + \lambda\tilde B(z, z)$$

can be carried into one another by a transformation $x = Tz$ ($|T| \ne 0$) if
and only if the pencils of symmetric matrices $A + \lambda B$ and $\tilde A + \lambda\tilde B$ have the
same elementary divisors (finite and infinite) and the same minimal indices.

Note. For pencils of symmetric matrices the rows and columns have the
same minimal indices:

$$p = q, \quad \varepsilon_1 = \eta_1, \dots, \varepsilon_p = \eta_p. \tag{48}$$


2. Let us raise the following question: Given two arbitrary complex
quadratic forms

$$A(x, x) = \sum_{i,k=1}^{n} a_{ik}x_ix_k, \qquad B(x, x) = \sum_{i,k=1}^{n} b_{ik}x_ix_k.$$

Under what conditions can the two forms be reduced simultaneously to
sums of squares

$$\sum_{i=1}^{n} a_iz_i^2 \quad\text{and}\quad \sum_{i=1}^{n} b_iz_i^2 \tag{49}$$

by a non-singular transformation of the variables $x = Tz$ ($|T| \ne 0$)?

Let us assume that the quadratic forms $A(x, x)$ and $B(x, x)$ have this
property. Then the pencil of matrices $A + \lambda B$ is congruent to the pencil
of diagonal matrices

$$\{a_1 + \lambda b_1,\ a_2 + \lambda b_2,\ \dots,\ a_n + \lambda b_n\}. \tag{50}$$

Suppose that among the diagonal binomials $a_i + \lambda b_i$ there are precisely r
($r \le n$) that are not identically zero. Without loss of generality we can
assume that

$$a_1 = b_1 = 0,\ \dots,\ a_{n-r} = b_{n-r} = 0, \qquad a_i + \lambda b_i \not\equiv 0 \quad (i = n - r + 1, \dots, n). \tag{51}$$

Setting

$$A_0 + \lambda B_0 = \{a_{n-r+1} + \lambda b_{n-r+1},\ \dots,\ a_n + \lambda b_n\}, \tag{52}$$

we represent the matrix (50) in the form

$$\{O_{(n-r)\times(n-r)},\ A_0 + \lambda B_0\}. \tag{53}$$

Comparing (53) with (34) (p. 39), we see that in this case all the minimal
indices are zero. Moreover, all the elementary divisors are linear. Thus
we have obtained the following theorem:


THEOREM 7: Two quadratic forms $A(x, x)$ and $B(x, x)$ can be reduced
simultaneously to sums of squares (49) by a transformation of the variables
if and only if in the pencil of matrices $A + \lambda B$ all the elementary divisors
(finite and infinite) are linear and all the minimal indices are zero.
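For illustration, consider a pair of forms chosen here for the purpose (it is our own example, not one taken from the text), for which the criterion fails because of a non-linear infinite elementary divisor:

```latex
A(x,x) = 2x_1x_2,\quad B(x,x) = x_1^2:\qquad
A + \lambda B = \begin{pmatrix} \lambda & 1 \\ 1 & 0 \end{pmatrix},\qquad
\mu A + \lambda B = \begin{pmatrix} \lambda & \mu \\ \mu & 0 \end{pmatrix},
\\[4pt]
|A + \lambda B| = -1 \neq 0, \qquad
D_2(\lambda,\mu) = -\mu^2, \qquad D_1(\lambda,\mu) = 1 .
```

The pencil is regular, so it has no minimal indices, and it has no finite elementary divisors; its only elementary divisor is the infinite one $\mu^2$, which is not linear. Hence these two forms cannot be reduced simultaneously to sums of squares. This can also be seen directly: a congruent diagonal pencil $\{a_1 + \lambda b_1, a_2 + \lambda b_2\}$ would have the constant non-zero determinant $-|T|^2$, which forces $b_1b_2 = 0$ and $a_1b_2 + a_2b_1 = 0$ with $a_1a_2 \ne 0$, i.e., $b_1 = b_2 = 0$, contradicting $B \ne O$.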


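In order to reduce two quadratic forms $A(x, x)$ and $B(x, x)$ simultaneously
to some canonical form in the general case, we have to replace the
pencil of matrices $A + \lambda B$ by a strictly equivalent 'canonical' pencil of
symmetric matrices.

Suppose the pencil of symmetric matrices $A + \lambda B$ has the minimal indices
$\varepsilon_1 = \dots = \varepsilon_g = 0$, $\varepsilon_{g+1} \ne 0, \dots, \varepsilon_p \ne 0$, the infinite elementary divisors
$\mu^{u_1}, \mu^{u_2}, \dots, \mu^{u_s}$ and the finite ones $(\lambda + \lambda_1)^{c_1}, (\lambda + \lambda_2)^{c_2}, \dots, (\lambda + \lambda_t)^{c_t}$. Then,
in the canonical form (30), g = h, p = q and $\varepsilon_{g+1} = \eta_{g+1}, \dots, \varepsilon_p = \eta_p$. We
replace in (30) every two diagonal blocks of the form $L_\varepsilon$ and $L_\varepsilon^T$ by a single
diagonal block $\begin{pmatrix} O & L_\varepsilon^T \\ L_\varepsilon & O \end{pmatrix}$ and each block of the form $N^{(u)} = E^{(u)} + \lambda H^{(u)}$ by the
strictly equivalent symmetric block

$$\tilde N^{(u)} = V^{(u)}N^{(u)} = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & \lambda \\ \vdots & \vdots & & \vdots & \vdots \\ 1 & \lambda & \cdots & 0 & 0 \end{pmatrix}, \qquad V^{(u)} = \begin{pmatrix} 0 & \cdots & 0 & 1 \\ 0 & \cdots & 1 & 0 \\ \vdots & & \vdots & \vdots \\ 1 & \cdots & 0 & 0 \end{pmatrix}. \tag{54}$$

Moreover, instead of the regular diagonal block $J + \lambda E$ in (30) (J is a
Jordan matrix)

$$J + \lambda E = \{(\lambda + \lambda_1)E^{(c_1)} + H^{(c_1)},\ \dots,\ (\lambda + \lambda_t)E^{(c_t)} + H^{(c_t)}\},$$

we take the strictly equivalent block

$$\{Z_{\lambda_1}^{(c_1)},\ \dots,\ Z_{\lambda_t}^{(c_t)}\}, \tag{55}$$

where

$$Z_{\lambda_i}^{(c_i)} = V^{(c_i)}\left[(\lambda + \lambda_i)E^{(c_i)} + H^{(c_i)}\right] = \begin{pmatrix} 0 & \cdots & 0 & \lambda + \lambda_i \\ 0 & \cdots & \lambda + \lambda_i & 1 \\ \vdots & & \vdots & \vdots \\ \lambda + \lambda_i & 1 & \cdots & 0 \end{pmatrix} \qquad (i = 1, 2, \dots, t). \tag{56}$$

The pencil $A + \lambda B$ is strictly equivalent to the symmetric pencil

$$\tilde A + \lambda\tilde B = \left\{ O_{g\times g};\ \begin{pmatrix} O & L^T_{\varepsilon_{g+1}} \\ L_{\varepsilon_{g+1}} & O \end{pmatrix}, \dots, \begin{pmatrix} O & L^T_{\varepsilon_p} \\ L_{\varepsilon_p} & O \end{pmatrix};\ \tilde N^{(u_1)}, \dots, \tilde N^{(u_s)};\ Z_{\lambda_1}^{(c_1)}, \dots, Z_{\lambda_t}^{(c_t)} \right\}. \tag{57}$$

Two quadratic forms with complex coefficients $A(x, x)$ and $B(x, x)$ can
be simultaneously reduced to the canonical forms $\tilde A(z, z)$ and $\tilde B(z, z)$ defined
in (57) by a transformation of the variables $x = Tz$ ($|T| \ne 0$).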


17 In the Russian edition the author stated that propositions analogous to Theorems 6
and 7 hold for hermitian forms. A. I. Mal'cev has pointed out to the author that this is
not the case. As regards singular pencils of hermitian forms, see [197 II].


§ 7. Application to Differential Equations


1. The results obtained will now be applied to a system of m linear
differential equations of the first order in n unknown functions with constant
coefficients:¹⁸

$$\sum_{k=1}^{n} a_{ik}x_k + \sum_{k=1}^{n} b_{ik}\frac{dx_k}{dt} = f_i(t) \qquad (i = 1, 2, \dots, m), \tag{58}$$

or in matrix notation:

$$Ax + B\frac{dx}{dt} = f(t); \tag{59}$$

here¹⁹

$$A = \|a_{ik}\|, \quad B = \|b_{ik}\| \quad (i = 1, 2, \dots, m;\ k = 1, 2, \dots, n),$$

$$x = (x_1, x_2, \dots, x_n), \qquad f = (f_1, f_2, \dots, f_m).$$

We introduce new unknown functions $z_1, z_2, \dots, z_n$ that are connected
with the old $x_1, x_2, \dots, x_n$ by a linear non-singular transformation with
constant coefficients:

$$x = Qz \qquad (z = (z_1, z_2, \dots, z_n);\ |Q| \ne 0). \tag{60}$$

Moreover, instead of the equations (58) we can take m arbitrary
independent combinations of these equations, which is equivalent to multiplying
the matrices A, B, f on the left by a square non-singular matrix P of order m.
Substituting Qz for x in (59) and multiplying (59) on the left by P, we
obtain:

$$\tilde Az + \tilde B\frac{dz}{dt} = \tilde f(t), \tag{61}$$

where

$$\tilde A = PAQ, \quad \tilde B = PBQ, \quad \tilde f = Pf = (\tilde f_1, \tilde f_2, \dots, \tilde f_m). \tag{62}$$

The matrix pencils $A + \lambda B$ and $\tilde A + \lambda\tilde B$ are strictly equivalent:

$$\tilde A + \lambda\tilde B = P(A + \lambda B)Q. \tag{63}$$

We choose the matrices P and Q such that the pencil $\tilde A + \lambda\tilde B$ has the
canonical quasi-diagonal form

$$\tilde A + \lambda\tilde B = \{O_{h\times g};\ L_{\varepsilon_{g+1}}, \dots, L_{\varepsilon_p};\ L^T_{\eta_{h+1}}, \dots, L^T_{\eta_q};\ N^{(u_1)}, \dots, N^{(u_s)};\ J + \lambda E\}. \tag{64}$$

18 The particular case where m = n and the system (58) is solved with respect to the
derivatives has been treated in detail in Vol. I, Chapter V, § 5.

It is well known that a system of linear differential equations with constant
coefficients of arbitrary order s can be reduced to the form (58) if all the derivatives of the
unknown functions up to and including the order s − 1 are included as additional unknown
functions.

19 We recall that parentheses denote column matrices. Thus, $x = (x_1, x_2, \dots, x_n)$ is the
column with the elements $x_1, x_2, \dots, x_n$.


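In accordance with the diagonal blocks in (64) the system of differential
equations splits into $\nu = p - g + q - h + s + 2$ separate systems of the form

$$O_{h\times g}\cdot z^{(1)} = \tilde f^{(1)}, \tag{65}$$

$$L_{\varepsilon_{g+i}}\!\left(\frac{d}{dt}\right) z^{(1+i)} = \tilde f^{(1+i)} \qquad (i = 1, 2, \dots, p - g), \tag{66}$$

$$L^T_{\eta_{h+j}}\!\left(\frac{d}{dt}\right) z^{(p-g+1+j)} = \tilde f^{(p-g+1+j)} \qquad (j = 1, 2, \dots, q - h), \tag{67}$$

$$N^{(u_k)}\!\left(\frac{d}{dt}\right) z^{(p-g+q-h+1+k)} = \tilde f^{(p-g+q-h+1+k)} \qquad (k = 1, 2, \dots, s), \tag{68}$$

$$\left(J + \frac{d}{dt}\right) z^{(\nu)} = \tilde f^{(\nu)}, \tag{69}$$

where

$$z = \begin{pmatrix} z^{(1)} \\ z^{(2)} \\ \vdots \\ z^{(\nu)} \end{pmatrix}, \qquad \tilde f = \begin{pmatrix} \tilde f^{(1)} \\ \tilde f^{(2)} \\ \vdots \\ \tilde f^{(\nu)} \end{pmatrix}, \tag{70}$$

$$z^{(1)} = (z_1, \dots, z_g),\ \tilde f^{(1)} = (\tilde f_1, \dots, \tilde f_h),\ z^{(2)} = (z_{g+1}, \dots),\ \tilde f^{(2)} = (\tilde f_{h+1}, \dots), \text{ etc.}, \tag{71}$$

$$\Lambda\!\left(\frac{d}{dt}\right) = A + B\frac{d}{dt}, \quad\text{if}\quad \Lambda(\lambda) = A + \lambda B. \tag{72}$$

Thus, the integration of the system (59) in the most general case is
reduced to the integration of the special systems (65)-(69) of the same type.
In these systems the matrix pencil $A + \lambda B$ has the form $O$, $L_\varepsilon$, $L_\eta^T$, $N^{(u)}$, and
$J + \lambda E$, respectively.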


1) The system (65) is not inconsistent if and only if

$$\tilde f^{(1)} = o,$$

i.e.,

$$\tilde f_1 = 0,\ \dots,\ \tilde f_h = 0. \tag{73}$$

In that case we can take arbitrary functions of t as the unknown functions
$z_1, z_2, \dots, z_g$ that form the column $z^{(1)}$.
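2) The system (66) is of the form

$$L_\varepsilon\!\left(\frac{d}{dt}\right) z = \tilde f \tag{74}$$

or, more explicitly,²⁰

$$\frac{dz_1}{dt} + z_2 = \tilde f_1(t),\quad \frac{dz_2}{dt} + z_3 = \tilde f_2(t),\quad \dots,\quad \frac{dz_\varepsilon}{dt} + z_{\varepsilon+1} = \tilde f_\varepsilon(t). \tag{75}$$

Such a system is always consistent. If we take for $z_{\varepsilon+1}(t)$ an arbitrary
function of t, then all the remaining unknown functions $z_\varepsilon, z_{\varepsilon-1}, \dots, z_1$ can
be determined from (75) by successive quadratures.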
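The successive quadratures can be sketched as follows for polynomial right-hand sides. This is a minimal illustration under our own assumptions: functions of t are represented as coefficient lists of exact fractions, and the names `poly`, `integ`, `solve_L_eps` and the sample data are hypothetical, not taken from the text.

```python
from fractions import Fraction


def poly(*coeffs):
    """Polynomial c0 + c1*t + c2*t^2 + ... as a list of exact fractions."""
    return [Fraction(c) for c in coeffs]


def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def sub(p, q):
    return add(p, [-c for c in q])


def deriv(p):
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def integ(p, const=0):
    """Antiderivative of p with the given constant of integration."""
    return [Fraction(const)] + [c / (k + 1) for k, c in enumerate(p)]


def solve_L_eps(f, z_last, consts):
    """Solve dz_i/dt + z_{i+1} = f_i (i = 1..eps) by successive quadratures.

    f      : the eps right-hand sides f_1, ..., f_eps
    z_last : the arbitrarily chosen function z_{eps+1}
    consts : the eps arbitrary constants of integration
    """
    eps = len(f)
    z = [None] * (eps + 1)
    z[eps] = z_last
    for i in range(eps - 1, -1, -1):
        # dz_i/dt = f_i - z_{i+1}, integrated with an arbitrary constant.
        z[i] = integ(sub(f[i], z[i + 1]), consts[i])
    return z


# Sample data: eps = 2, f_1 = 1, f_2 = t, and the free choice z_3 = t^2.
f = [poly(1), poly(0, 1)]
z = solve_L_eps(f, z_last=poly(0, 0, 1), consts=[0, 0])
# Substituting back into (75) must give zero in every equation.
residuals = [sub(add(deriv(z[i]), z[i + 1]), f[i]) for i in range(len(f))]
```

The solution depends on one arbitrary function (`z_last`) and on $\varepsilon$ arbitrary constants (`consts`), in agreement with the description in the text.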


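3) The system (67) is of the form

$$L_\eta^T\!\left(\frac{d}{dt}\right) z = \tilde f \tag{76}$$

or, more explicitly,²¹

$$\frac{dz_1}{dt} = \tilde f_1(t),\quad \frac{dz_2}{dt} + z_1 = \tilde f_2(t),\quad \dots,\quad \frac{dz_\eta}{dt} + z_{\eta-1} = \tilde f_\eta(t),\quad z_\eta = \tilde f_{\eta+1}(t). \tag{77}$$

From all the equations (77) except the first we determine $z_\eta, z_{\eta-1}, \dots, z_1$
uniquely:

$$\begin{aligned} z_\eta &= \tilde f_{\eta+1}, \\ z_{\eta-1} &= \tilde f_\eta - \frac{d\tilde f_{\eta+1}}{dt}, \\ &\ \,\vdots \\ z_1 &= \tilde f_2 - \frac{d\tilde f_3}{dt} + \dots + (-1)^{\eta-1}\frac{d^{\eta-1}\tilde f_{\eta+1}}{dt^{\eta-1}}. \end{aligned} \tag{78}$$

Substituting this expression for $z_1$ into the first equation, we obtain the
condition for consistency:

$$\tilde f_1 - \frac{d\tilde f_2}{dt} + \frac{d^2\tilde f_3}{dt^2} - \dots + (-1)^\eta\frac{d^\eta\tilde f_{\eta+1}}{dt^\eta} = 0. \tag{79}$$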

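20 We have changed the indices of z and $\tilde f$ to simplify the notation. In order to return
from (75) to (66) we have to replace $\varepsilon$ by $\varepsilon_{g+i}$ and add to each index of z the number
$g + \varepsilon_{g+1} + \dots + \varepsilon_{g+i-1} + i - 1$, to each index of $\tilde f$ the number $h + \varepsilon_{g+1} + \dots + \varepsilon_{g+i-1}$.

21 Here, as in the preceding case, we have changed the notation. See the preceding
footnote.

4) The system (68) is of the form

$$N^{(u)}\!\left(\frac{d}{dt}\right) z = \tilde f \tag{80}$$

or, more explicitly,

$$\frac{dz_2}{dt} + z_1 = \tilde f_1,\quad \frac{dz_3}{dt} + z_2 = \tilde f_2,\quad \dots,\quad \frac{dz_u}{dt} + z_{u-1} = \tilde f_{u-1},\quad z_u = \tilde f_u. \tag{81}$$

Hence we determine successively the unique solutions

$$\begin{aligned} z_u &= \tilde f_u, \\ z_{u-1} &= \tilde f_{u-1} - \frac{d\tilde f_u}{dt}, \\ &\ \,\vdots \\ z_1 &= \tilde f_1 - \frac{d\tilde f_2}{dt} + \frac{d^2\tilde f_3}{dt^2} - \dots + (-1)^{u-1}\frac{d^{u-1}\tilde f_u}{dt^{u-1}}. \end{aligned} \tag{82}$$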

5) The system (69) is of the form

$$Jz + \frac{dz}{dt} = \tilde f. \tag{83}$$

As we have proved in Vol. I, Chapter V, § 5, the general solution of such
a system has the form

$$z = e^{-Jt}z_0 + \int_0^t e^{-J(t-\tau)}\tilde f(\tau)\,d\tau; \tag{84}$$

here $z_0$ is a column matrix with arbitrary elements (the initial values of
the unknown functions for t = 0).
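That the formula (84) does satisfy (83) can be checked by differentiating it, the integral being differentiated with respect to its upper limit and with respect to the parameter t:

```latex
\frac{dz}{dt}
  = -J e^{-Jt} z_0 + \tilde f(t) - J \int_0^t e^{-J(t-\tau)} \tilde f(\tau)\, d\tau
  = -J z + \tilde f(t),
\qquad z(0) = z_0 .
```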

The inverse transition from the system (61) to (59) is effected by the
formulas (60) and (62), according to which each of the functions $x_1, \dots, x_n$
is a linear combination of the functions $z_1, \dots, z_n$ and each of the functions
$\tilde f_1(t), \dots, \tilde f_m(t)$ is expressed linearly (with constant coefficients) in terms
of the functions $f_1(t), \dots, f_m(t)$.


2. The preceding analysis shows that: In general, for the consistency of the
system (58) certain well-defined linear dependence relations (with constant
coefficients) must hold among the right-hand sides of the equations and the
derivatives of these right-hand sides.

If these relations are satisfied, then the general solution of the system
contains both arbitrary constants and arbitrary functions linearly.

The character of the consistency conditions and the character of the
solutions (in particular, the number of arbitrary constants and arbitrary
functions) are determined by the minimal indices and the elementary
divisors of the pencil $A + \lambda B$, because the canonical form (65)-(69) of the
system of differential equations depends on these minimal indices and
elementary divisors.


CHAPTER XIII 


MATRICES WITH NON-NEGATIVE ELEMENTS 


In this chapter we shall study properties of real matrices with non-negative
elements. Such matrices have important applications in the theory of
probability, where they are used for the investigation of Markov chains
('stochastic matrices,' see [46]), and in the theory of small oscillations of
elastic systems ('oscillation matrices,' see [17]).


§ 1. General Properties 


1. We begin with some definitions.

DEFINITION 1: A rectangular matrix A with real elements

$$A = \|a_{ik}\| \qquad (i = 1, 2, \dots, m;\ k = 1, 2, \dots, n)$$

is called non-negative (notation: $A \ge O$) or positive (notation: $A > O$) if
all the elements of A are non-negative ($a_{ik} \ge 0$) or positive ($a_{ik} > 0$).


DEFINITION 2: A square matrix $A = \|a_{ik}\|_1^n$ is called reducible if the
index set 1, 2, ..., n can be split into two complementary sets (without
common indices) $i_1, i_2, \dots, i_\mu$; $k_1, k_2, \dots, k_\nu$ ($\mu + \nu = n$) such that

$$a_{i_\alpha k_\beta} = 0 \qquad (\alpha = 1, 2, \dots, \mu;\ \beta = 1, 2, \dots, \nu).$$

Otherwise the matrix is called irreducible.

By a permutation of a square matrix $A = \|a_{ik}\|_1^n$ we mean a permutation
of the rows of A combined with the same permutation of the columns.

The definition of a reducible matrix and an irreducible matrix can also
be formulated as follows:


DEFINITION 2': A matrix $A = \|a_{ik}\|_1^n$ is called reducible if there is a
permutation that puts it into the form

$$\tilde A = \begin{pmatrix} B & O \\ C & D \end{pmatrix},$$

where B and D are square matrices. Otherwise A is called irreducible.


Suppose that $A = \|a_{ik}\|_1^n$ corresponds to a linear operator $\mathbf{A}$ in an
n-dimensional vector space R with the basis $e_1, e_2, \dots, e_n$. To a permutation
of A there corresponds a renumbering of the basis vectors, i.e., a transition
from the basis $e_1, e_2, \dots, e_n$ to a new basis $e_1' = e_{j_1}, e_2' = e_{j_2}, \dots, e_n' = e_{j_n}$,
where $(j_1, j_2, \dots, j_n)$ is a permutation of the indices 1, 2, ..., n. The matrix
A then goes over into a similar matrix $\tilde A = T^{-1}AT$. (Each row and each
column of the transforming matrix T contains a single element 1, and the
remaining elements are zero.)


2. By a $\nu$-dimensional coordinate subspace of R we mean a subspace of R
with a basis $e_{k_1}, e_{k_2}, \dots, e_{k_\nu}$ ($1 \le k_1 < k_2 < \dots < k_\nu \le n$). There are $\binom{n}{\nu}$
$\nu$-dimensional coordinate subspaces of R connected with a given basis
$e_1, e_2, \dots, e_n$. The definition of a reducible matrix can also be given in the
following form:


DEFINITION 2'': A matrix $A = \|a_{ik}\|_1^n$ is called reducible if and only if
the corresponding operator $\mathbf{A}$ has a $\nu$-dimensional invariant coordinate
subspace with $\nu < n$.

We shall now prove the following lemma:


LEMMA 1: If $A \ge O$ is an irreducible matrix of order n, then

$$(E + A)^{n-1} > O. \tag{1}$$

Proof. For the proof of the lemma it is sufficient to show that for every
vector¹ (i.e., column) $y \ge o$ ($y \ne o$) the inequality

$$(E + A)^{n-1}y > o$$

holds.

This inequality will be established if we can only show that under the
conditions $y \ge o$ and $y \ne o$ the vector $z = (E + A)y$ always has fewer zero
coordinates than y does (whenever y has any zero coordinates at all). Let us
assume the contrary. Then y and z have the same zero coordinates.² Without
loss of generality we may assume that the columns y and z have the form³

$$y = \begin{pmatrix} u \\ o \end{pmatrix}, \qquad z = \begin{pmatrix} v \\ o \end{pmatrix} \qquad (u > o,\ v > o),$$

where the columns u and v are of the same dimension.

Setting

$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$

we have

$$\begin{pmatrix} u \\ o \end{pmatrix} + \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\begin{pmatrix} u \\ o \end{pmatrix} = \begin{pmatrix} v \\ o \end{pmatrix},$$

and hence

$$A_{21}u = o.$$

Since $u > o$, it follows that

$$A_{21} = O.$$

This equation contradicts the irreducibility of A.
Thus the lemma is proved.

1 Here and throughout this chapter we mean by a vector a column of n numbers. In
this way we identify, as it were, a vector with the column of its coordinates in that basis
in which the given matrix $A = \|a_{ik}\|_1^n$ determines a certain linear operator.

2 Here we start from the fact that $z = y + Ay$ and $Ay \ge o$; therefore to positive
coordinates of y there correspond positive coordinates of z.

3 The columns y and z can be brought into this form by means of a suitable renumbering
of the coordinates (the same for y and z).
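Lemma 1 suggests a simple computational test of irreducibility. For a non-negative matrix the converse of the lemma also holds, since for a reducible A the zero block of Definition 2' is preserved in every power of E + A; the sketch below relies on this. The function names are our own, and only the pattern of zero and non-zero elements is tracked, which is an implementation choice made to keep the numbers small.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_irreducible(A):
    """Test a non-negative square matrix A for irreducibility by checking
    whether (E + A)^(n-1) has only positive elements."""
    n = len(A)
    # Zero pattern of E + A: 1 where the element is positive, 0 elsewhere.
    M = [[1 if (i == j or A[i][j] > 0) else 0 for j in range(n)]
         for i in range(n)]
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(n - 1):
        P = mat_mul(P, M)
        # Only positivity matters, so reduce the elements back to 0 or 1.
        P = [[1 if v > 0 else 0 for v in row] for row in P]
    return all(v > 0 for row in P for v in row)


cyclic = [[0, 1, 0],
          [0, 0, 1],
          [1, 0, 0]]       # a cyclic permutation matrix: irreducible
triangular = [[1, 0],
              [1, 1]]      # already of the form in Definition 2': reducible

cyclic_ok = is_irreducible(cyclic)
triangular_ok = is_irreducible(triangular)
```

Here `cyclic_ok` is `True` and `triangular_ok` is `False`.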
We introduce the powers of A:

$$A^q = \|a_{ik}^{(q)}\|_1^n \qquad (q = 1, 2, \dots).$$

Then the lemma has the following corollary:


COROLLARY: If $A \ge O$ is an irreducible matrix, then for every index
pair i, k ($1 \le i, k \le n$) there exists a positive integer q such that

$$a_{ik}^{(q)} > 0. \tag{2}$$

Moreover, q can always be chosen within the bounds

$$q \le m - 1 \ \text{ if } i \ne k, \qquad q \le m \ \text{ if } i = k, \tag{3}$$

where m is the degree of the minimal polynomial $\psi(\lambda)$ of A.

For let $r(\lambda)$ denote the remainder on dividing $(\lambda + 1)^{n-1}$ by $\psi(\lambda)$.
Then by (1) we have $r(A) > O$. Since the degree of $r(\lambda)$ is less than m,
it follows from this inequality that for arbitrary i, k ($1 \le i, k \le n$) at least
one of the non-negative numbers

$$\delta_{ik},\ a_{ik},\ a_{ik}^{(2)},\ \dots,\ a_{ik}^{(m-1)}$$

is not zero. Since $\delta_{ik} = 0$ for $i \ne k$, the first of the relations (3) follows.
The other relation (for i = k) is obtained similarly if the inequality
$r(A) > O$ is replaced by $Ar(A) > O$.⁴

Note. This corollary of the lemma shows that in (1) the number n − 1
can be replaced by m − 1.


§ 2. Spectral Properties of Irreducible Non-negative Matrices 


1. In 1907 Perron found a remarkable property of the spectra (i.e., the
characteristic values and characteristic vectors) of positive matrices.⁵


THEOREM 1 (Perron): A positive matrix $A = \|a_{ik}\|_1^n$ always has a real
and positive characteristic value r which is a simple root of the characteristic
equation and exceeds the moduli of all the other characteristic values. To this
'maximal' characteristic value r there corresponds a characteristic vector
$z = (z_1, z_2, \dots, z_n)$ of A with positive coordinates $z_i > 0$ ($i = 1, 2, \dots, n$).⁶

A positive matrix is a special case of an irreducible non-negative matrix.
Frobenius⁷ has generalized Perron's theorem by investigating the spectral
properties of irreducible non-negative matrices.


THEOREM 2 (Frobenius): An irreducible non-negative matrix
$A = \|a_{ik}\|_1^n$ always has a positive characteristic value r that is a simple root of
the characteristic equation. The moduli of all the other characteristic values
do not exceed r. To the 'maximal' characteristic value r there corresponds
a characteristic vector with positive coordinates.

Moreover, if A has h characteristic values $\lambda_0 = r, \lambda_1, \dots, \lambda_{h-1}$ of modulus
r, then these numbers are all distinct and are roots of the equation

$$\lambda^h - r^h = 0. \tag{4}$$

More generally: The whole spectrum $\lambda_0, \lambda_1, \dots, \lambda_{n-1}$ of A, regarded as a
system of points in the complex $\lambda$-plane, goes over into itself under a rotation
of the plane by the angle $2\pi/h$. If h > 1, then A can be put by means of a
permutation into the following 'cyclic' form:

$$A = \begin{pmatrix} O & A_{12} & O & \cdots & O \\ O & O & A_{23} & \cdots & O \\ \vdots & \vdots & \vdots & & \vdots \\ O & O & O & \cdots & A_{h-1,h} \\ A_{h1} & O & O & \cdots & O \end{pmatrix}, \tag{5}$$

where there are square blocks along the main diagonal.

4 The product of an irreducible non-negative matrix and a positive matrix is itself
positive.

5 See [316], [317], and [17], p. 100.

6 Since r is a simple characteristic value, the characteristic vector z belonging to it is
determined to within a scalar factor. By Perron's theorem all the coordinates of z are
real, different from zero, and of like sign. By multiplying z by −1, if necessary, we
can make all its coordinates positive. In the latter case the vector (column)
$z = (z_1, z_2, \dots, z_n)$ is called positive (as in Definition 1).

7 See [165] and [166].
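A small numerical illustration of the second part of the theorem may be helpful. The matrix below is our own choice, not an example from the text: it is twice a cyclic permutation matrix of order 3, so it is already in the cyclic form (5) with blocks of order 1, and one expects r = 2 and h = 3. The sketch merely checks that the three roots of $\lambda^3 - r^3 = 0$ annul the characteristic determinant; the helper names are hypothetical.

```python
import cmath


def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, k, m) = M
    return a * (e * m - f * k) - b * (d * m - f * g) + c * (d * k - e * g)


A = [[0, 2, 0],
     [0, 0, 2],
     [2, 0, 0]]
r, h = 2, 3


def char_det(lam):
    """Value of the characteristic determinant |lam*E - A|."""
    M = [[(lam if i == j else 0) - A[i][j] for j in range(3)]
         for i in range(3)]
    return det3(M)


# The h points r * exp(2*pi*i*k/h) lie on the circle of radius r and are
# carried into one another by a rotation through the angle 2*pi/h.
roots = [r * cmath.exp(2j * cmath.pi * k / h) for k in range(h)]
residuals = [abs(char_det(lam)) for lam in roots]
```

All three values in `residuals` vanish up to rounding error, so in this example the whole spectrum consists of the roots of (4).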


Since Perron's theorem follows as a special case from Frobenius'
theorem, we shall only prove the latter.⁸ To begin with, we shall agree on
some notation.

We write

$$C \le D \quad\text{or}\quad D \ge C,$$

where C and D are real rectangular matrices of the same dimensions $m \times n$

$$C = \|c_{ik}\|, \quad D = \|d_{ik}\| \qquad (i = 1, 2, \dots, m;\ k = 1, 2, \dots, n),$$

if and only if

$$c_{ik} \le d_{ik} \qquad (i = 1, 2, \dots, m;\ k = 1, 2, \dots, n). \tag{6}$$

If the equality sign can be omitted in all the inequalities (6), then we
shall write

$$C < D \quad\text{or}\quad D > C.$$

In particular, $C \ge O$ ($> O$) means that all the elements of C are
non-negative (positive).

Furthermore, we denote by $C^+$ the matrix mod C which arises from C
when all the elements are replaced by their moduli.


2. Proof of Frobenius' Theorem:⁹ Let $x = (x_1, x_2, \dots, x_n)$ ($x \ge o$, $x \ne o$) be a
fixed real vector. We set:

$$r_x = \min_{1 \le i \le n} \frac{(Ax)_i}{x_i} \qquad \left((Ax)_i = \sum_{k=1}^{n} a_{ik}x_k;\ i = 1, 2, \dots, n\right).$$

In the definition of the minimum we exclude here the values of i for which
$x_i = 0$. Obviously $r_x \ge 0$, and $r_x$ is the largest real number $\rho$ for which

$$\rho x \le Ax.$$


8 For a direct proof of Perron's theorem see [17], p. 100 ff.

9 This proof is due to Wielandt [384].

We shall show that the function $r_x$ assumes a maximum value r for some
vector $z \ge o$:

$$r = r_z = \max_{(x \ge o)} r_x = \max_{(x \ge o)}\ \min_{1 \le i \le n} \frac{(Ax)_i}{x_i}. \tag{7}$$

From the definition of $r_x$ it follows that on multiplication of a vector
$x \ge o$ ($x \ne o$) by a number $\lambda > 0$ the value of $r_x$ does not change.
Therefore, in the computation of the maximum of $r_x$ we can restrict ourselves to
the closed set M of vectors x for which

$$x \ge o \quad\text{and}\quad (xx) = \sum_{i=1}^{n} x_i^2 = 1.$$

If the function $r_x$ were continuous on M, then the existence of a maximum
would be guaranteed. However, though continuous at every 'point' $x > o$,
$r_x$ may have discontinuities at the boundary points of M at which one of the
coordinates of x vanishes. Therefore, we introduce in place of M the set N of
all the vectors y of the form

$$y = (E + A)^{n-1}x \qquad (x \in M).$$

The set N, like M, is bounded and closed and by Lemma 1 consists of
positive vectors only.

Moreover, when we multiply both sides of the inequality

$$r_xx \le Ax$$

by $(E + A)^{n-1} > O$, we obtain:

$$r_xy \le Ay \qquad (y = (E + A)^{n-1}x).$$

Hence, from the definition of $r_y$ we have

$$r_x \le r_y.$$

Therefore in the computation of the maximum of $r_x$ we can replace M
by the set N which consists of positive vectors only. On the bounded and
closed set N the function $r_x$ is continuous and therefore assumes a largest
value for some vector $z > o$.
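The quantity $r_x$ and the inequality $r_x \le r_y$ for $y = (E + A)^{n-1}x$ can be illustrated on a small example. The matrix and the vector below are our own choice, and the function names `r_of` and `smooth` are hypothetical; exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def mat_vec(A, x):
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


def r_of(A, x):
    """r_x = min of (Ax)_i / x_i over the indices i with x_i != 0."""
    Ax = mat_vec(A, x)
    return min(Fraction(Ax[i]) / Fraction(x[i])
               for i in range(len(x)) if x[i] != 0)


def smooth(A, x):
    """Return y = (E + A)^(n-1) x."""
    n = len(x)
    y = list(x)
    for _ in range(n - 1):
        Ay = mat_vec(A, y)
        y = [y[i] + Ay[i] for i in range(n)]
    return y


A = [[0, 1],
     [1, 1]]            # irreducible and non-negative
x = [1, 0]              # a boundary vector with a zero coordinate
y = smooth(A, x)        # a positive vector, by Lemma 1

r_x = r_of(A, x)
r_y = r_of(A, y)
```

Here `y` is `[1, 1]`, `r_x` is 0 and `r_y` is 1, so that $r_x \le r_y$. For this matrix the maximal characteristic value is the positive root of $\lambda^2 - \lambda - 1 = 0$, and `r_y` indeed does not exceed it.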

Every vector $z \ge o$ for which

$$r_z = r \tag{8}$$

will be called extremal.

We shall now show that: 1) The number r defined by (7) is positive
and is a characteristic value of A; 2) Every extremal vector z is positive
and is a characteristic vector of A for the characteristic value r, i.e.,

$$r > 0, \quad z > o, \quad Az = rz. \tag{9}$$

For if $u = (1, 1, \dots, 1)$, then $r_u = \min_{1 \le i \le n} \sum_{k=1}^{n} a_{ik}$. But then $r_u > 0$,
because no row of an irreducible matrix can consist of zeros only. Therefore
$r > 0$, since $r \ge r_u$. Now let

$$x = (E + A)^{n-1}z. \tag{10}$$

Then, by Lemma 1, $x > o$. Suppose that $Az - rz \ne o$. Then by (1), (8),
and (10) we obtain successively:

$$Az - rz \ge o, \quad (E + A)^{n-1}(Az - rz) > o, \quad Ax - rx > o.$$

The last inequality contradicts the definition of r, because it would imply
that $Ax - (r + \varepsilon)x > o$ for sufficiently small $\varepsilon > 0$, i.e., $r_x \ge r + \varepsilon > r$.
Therefore $Az = rz$. But then

$$o < x = (E + A)^{n-1}z = (1 + r)^{n-1}z,$$

so that $z > o$.

We shall now show that the moduli of all the characteristic values do not
exceed r. Let

$$Ay = \alpha y \qquad (y \ne o). \tag{11}$$

Taking the moduli of both sides in (11), we obtain:¹⁰

$$|\alpha|\,y^+ \le Ay^+. \tag{12}$$

Hence

$$|\alpha| \le r_{y^+} \le r.$$

Let y be some characteristic vector corresponding to r:

$$Ay = ry \qquad (y \ne o).$$

Then setting $\alpha = r$ in (11) and (12) we conclude that $y^+$ is an extremal
vector, so that $y^+ > o$, i.e., $y = (y_1, y_2, \dots, y_n)$, where $y_i \ne 0$ ($i = 1, 2, \dots, n$).
Hence it follows that only one characteristic direction corresponds to the
characteristic value r; for if there were two linearly independent
characteristic vectors z and $z_1$, we could choose numbers c and d such that the
characteristic vector $y = cz + dz_1$ has a zero coordinate, and by what we have
shown this is impossible.


10 Regarding the notation $y^+$, see p. 54.

We now consider the adjoint matrix of the characteristic matrix $\lambda E - A$:

$$B(\lambda) = \| B_{ik}(\lambda) \|_1^n = \Delta(\lambda)\,(\lambda E - A)^{-1},$$

where $\Delta(\lambda)$ is the characteristic polynomial of $A$ and $B_{ik}(\lambda)$
the algebraic complement of the element $\lambda\delta_{ki} - a_{ki}$ in the
determinant $\Delta(\lambda)$. From the fact that only one characteristic vector
$z = (z_1, z_2, \ldots, z_n)$ with $z_1 > 0, z_2 > 0, \ldots, z_n > 0$ corresponds
to the characteristic value $r$ (apart from a factor) it follows that
$B(r) \ne O$ and that in every non-zero column of $B(r)$ all the elements are
different from zero and are of the same sign. The same is true for the rows
of $B(r)$, since in the preceding argument $A$ can be replaced by the
transposed matrix $A^T$. From these properties of the rows and columns of
$B(r)$ it follows that all the $B_{ik}(r)$ $(i, k = 1, 2, \ldots, n)$ are
different from zero and are of the same sign $\sigma$. Therefore

$$\sigma \Delta'(r) = \sigma \sum_{i=1}^{n} B_{ii}(r) > 0,$$

i.e., $\Delta'(r) \ne 0$ and $r$ is a simple root of the characteristic equation
$\Delta(\lambda) = 0$. Since $r$ is the maximal root of
$\Delta(\lambda) = \lambda^n + \ldots$, $\Delta(\lambda)$ increases for
$\lambda \ge r$. Therefore $\Delta'(r) > 0$ and $\sigma = 1$, i.e.,

$$B_{ik}(r) > 0 \quad (i, k = 1, 2, \ldots, n). \tag{13}$$
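
As a small illustration of (13), the following worked example uses a $2 \times 2$ positive matrix chosen here purely for convenience (it does not come from the text). It checks that $B(r) > O$ and that $\Delta'(r)$ equals the sum of the diagonal elements of $B(r)$.

```latex
A = \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix}, \qquad
\Delta(\lambda) = \lambda^2 - 5\lambda + 4 = (\lambda - 4)(\lambda - 1), \qquad r = 4,
\\[4pt]
B(\lambda) = \begin{pmatrix} \lambda - 3 & 1 \\ 2 & \lambda - 2 \end{pmatrix}, \qquad
B(4) = \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} > O,
\\[4pt]
\Delta'(4) = 2 \cdot 4 - 5 = 3 = B_{11}(4) + B_{22}(4).
```

Every column of $B(4)$ is proportional to the positive characteristic vector $z = (1, 2)$ of $A$, and every row is proportional to the positive characteristic vector $(1, 1)$ of $A^T$.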


3. Proceeding now to the proof of the second part of Frobenius' theorem,
we shall make use of the following interesting lemma:$^{11}$

Lemma 2: If $A = \| a_{ik} \|_1^n$ and $C = \| c_{ik} \|_1^n$ are two square
matrices of the same order $n$, where $A$ is irreducible and$^{12}$

$$C^+ \le A, \tag{14}$$

then for every characteristic value $\gamma$ of $C$ and the maximal
characteristic value $r$ of $A$ we have the inequality

$$|\gamma| \le r. \tag{15}$$

In the relation (15) the equality sign holds if and only if

$$C = e^{i\varphi} D A D^{-1}, \tag{16}$$

where $e^{i\varphi} = \gamma / r$ and $D$ is a diagonal matrix whose diagonal
elements are of unit modulus $(D^+ = E)$.

11 See [384].
12 $C$ is a complex matrix and $A \ge O$.
Proof. We denote by $y$ a characteristic vector of $C$ corresponding to the
characteristic value $\gamma$:

$$Cy = \gamma y \quad (y \ne o). \tag{17}$$

From (14) and (17) we find

$$|\gamma|\, y^+ \le C^+ y^+ \le A y^+. \tag{18}$$

Therefore

$$|\gamma| \le r_{y^+} \le r.$$

Let us now examine the case $|\gamma| = r$ in detail. Here it follows from
(18) that $y^+$ is an extremal vector for $A$, so that $y^+ > o$ and that $y^+$
is a characteristic vector of $A$ for the characteristic value $r$. Therefore
the relation (18) assumes the form

$$A y^+ = C^+ y^+ = r y^+, \quad y^+ > o. \tag{19}$$

Hence by (14)

$$C^+ = A. \tag{20}$$

Let $y = (y_1, y_2, \ldots, y_n)$, where

$$y_j = |y_j|\, e^{i\varphi_j} \quad (j = 1, 2, \ldots, n).$$

We define a diagonal matrix $D$ by the equation

$$D = \{ e^{i\varphi_1}, e^{i\varphi_2}, \ldots, e^{i\varphi_n} \}.$$

Then

$$y = D y^+.$$

Substituting this expression for $y$ in (17) and then setting
$\gamma = r e^{i\varphi}$, we find easily:

$$F y^+ = r y^+, \tag{21}$$

where

$$F = e^{-i\varphi} D^{-1} C D. \tag{22}$$

Comparing (19) with (21), we obtain

$$F y^+ = C^+ y^+ = A y^+. \tag{23}$$

But by (22) and (20)

$$F^+ = C^+ = A.$$

Therefore we find from (23)

$$F y^+ = F^+ y^+.$$

Since $y^+ > o$, this equation can hold only if

$$F = F^+,$$

i.e.,

$$e^{-i\varphi} D^{-1} C D = A.$$

Hence

$$C = e^{i\varphi} D A D^{-1},$$

and the Lemma is proved.


4. We return to Frobenius' theorem and apply the lemma to an irreducible
matrix $A \ge O$ that has precisely $h$ characteristic values of maximal
modulus $r$:

$$\lambda_0 = r e^{i\varphi_0}, \quad \lambda_1 = r e^{i\varphi_1}, \quad \ldots, \quad \lambda_{h-1} = r e^{i\varphi_{h-1}}$$
$$(0 = \varphi_0 < \varphi_1 < \varphi_2 < \cdots < \varphi_{h-1} < 2\pi).$$

Then, setting $C = A$ and $\gamma = \lambda_k$ in the lemma, we have, for every
$k = 0, 1, \ldots, h - 1$,

$$A = e^{i\varphi_k} D_k A D_k^{-1}, \tag{24}$$

where $D_k$ is a diagonal matrix with $D_k^+ = E$.

Again, let $z$ be a positive characteristic vector of $A$ corresponding to the
maximal characteristic value $r$:

$$Az = rz \quad (z > o). \tag{25}$$

Then setting

$$y^{(k)} = D_k z \quad (y^{(k)+} = z > o), \tag{26}$$

we find from (25) and (26):

$$A y^{(k)} = \lambda_k y^{(k)} \quad (\lambda_k = r e^{i\varphi_k};\ k = 0, 1, \ldots, h - 1). \tag{27}$$

The last equation shows that the vectors $y^{(0)}, y^{(1)}, \ldots, y^{(h-1)}$
defined in (26) are characteristic vectors of $A$ for the characteristic values
$\lambda_0, \lambda_1, \ldots, \lambda_{h-1}$.

From (24) it follows not only that $\lambda_0 = r$, but also that each
characteristic value $\lambda_1, \ldots, \lambda_{h-1}$ of $A$ is simple.
Therefore the characteristic vectors $y^{(k)}$ and hence the matrices $D_k$
$(k = 0, 1, \ldots, h - 1)$ are determined to within scalar factors. To define
the matrices $D_0, D_1, \ldots, D_{h-1}$ uniquely we shall choose their first
diagonal element to be 1. Then $D_0 = E$ and $y^{(0)} = z > o$.
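Furthermore, from (24) it follows that

$$A = e^{i(\varphi_j \pm \varphi_k)} D_j D_k^{\pm 1} A D_k^{\mp 1} D_j^{-1} \quad (j, k = 0, 1, \ldots, h - 1).$$

Hence we deduce similarly that the vector

$$D_j D_k^{\pm 1} z$$

is a characteristic vector of $A$ corresponding to the characteristic value

$$r e^{i(\varphi_j \pm \varphi_k)}.$$

Therefore $e^{i(\varphi_j \pm \varphi_k)}$ coincides with one of the numbers
$e^{i\varphi_l}$ and the matrix $D_j D_k^{\pm 1}$ with the corresponding matrix
$D_l$; that is, we have, for some $l_1, l_2$ $(0 \le l_1, l_2 \le h - 1)$

$$e^{i(\varphi_j + \varphi_k)} = e^{i\varphi_{l_1}}, \quad e^{i(\varphi_j - \varphi_k)} = e^{i\varphi_{l_2}},$$
$$D_j D_k = D_{l_1}, \quad D_j D_k^{-1} = D_{l_2}.$$

Thus: The numbers $e^{i\varphi_0}, e^{i\varphi_1}, \ldots, e^{i\varphi_{h-1}}$ and
the corresponding diagonal matrices $D_0, D_1, \ldots, D_{h-1}$ form two
isomorphic multiplicative abelian groups.

In every finite group consisting of $h$ distinct elements the $h$-th power
of every element is equal to the unit element of the group. Therefore
$e^{i\varphi_0}, e^{i\varphi_1}, \ldots, e^{i\varphi_{h-1}}$ are $h$-th roots of
unity. Since there are $h$ such roots of unity and
$\varphi_0 = 0 < \varphi_1 < \varphi_2 < \cdots < \varphi_{h-1} < 2\pi$,

$$\varphi_k = \frac{2k\pi}{h} \quad (k = 0, 1, 2, \ldots, h - 1)$$

and

$$e^{i\varphi_k} = \varepsilon^k \quad (\varepsilon = e^{i\varphi_1} = e^{\frac{2\pi i}{h}};\ k = 0, 1, \ldots, h - 1), \tag{28}$$
$$\lambda_k = r \varepsilon^k \quad (k = 0, 1, \ldots, h - 1). \tag{29}$$

The numbers $\lambda_0, \lambda_1, \ldots, \lambda_{h-1}$ form a complete system
of roots of (4).

In accordance with (28), we have:$^{14}$

$$D_k = D^k \quad (D = D_1;\ k = 0, 1, \ldots, h - 1). \tag{30}$$

The equation (24) now gives us (for $k = 1$):

$$A = e^{\frac{2\pi i}{h}} D A D^{-1}. \tag{31}$$

14 Here we use the isomorphism of the multiplicative groups
$e^{i\varphi_0}, e^{i\varphi_1}, \ldots, e^{i\varphi_{h-1}}$ and $D_0, D_1, \ldots, D_{h-1}$.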
Hence it follows that the matrix $A$ on multiplication by $e^{\frac{2\pi i}{h}}$
goes over into a similar matrix and, therefore, that the whole system of $n$
characteristic values of $A$ on multiplication by $e^{\frac{2\pi i}{h}}$ goes
over into itself.$^{15}$

Further,

$$D^h = E,$$

so that all the diagonal elements of $D$ are $h$-th roots of unity. By a
permutation of $A$ (and similarly of $D$) we can arrange that $D$ be of the
following quasi-diagonal form:

$$D = \{ \eta_0 E_0, \eta_1 E_1, \ldots, \eta_{s-1} E_{s-1} \}, \tag{32}$$

where $E_0, E_1, \ldots, E_{s-1}$ are unit matrices and

$$\eta_p = e^{i\psi_p}, \quad \psi_p = n_p \frac{2\pi}{h}$$

($n_p$ is an integer; $p = 0, 1, \ldots, s - 1$; $0 = n_0 < n_1 < \cdots < n_{s-1} < h$).

Obviously $s \le h$.
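Writing $A$ in block form (in accordance with (32))

$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1s} \\ A_{21} & A_{22} & \cdots & A_{2s} \\ \vdots & \vdots & & \vdots \\ A_{s1} & A_{s2} & \cdots & A_{ss} \end{pmatrix}, \tag{33}$$

we replace (31) by the system of equations

$$\varepsilon A_{pq} = \frac{\eta_{q-1}}{\eta_{p-1}} A_{pq} \quad (p, q = 1, 2, \ldots, s;\ \varepsilon = e^{\frac{2\pi i}{h}}). \tag{34}$$

Hence for every $p$ and $q$ either $\frac{\eta_{q-1}}{\eta_{p-1}} = \varepsilon$
or $A_{pq} = O$.

Let us take $p = 1$. Since the matrices $A_{12}, A_{13}, \ldots, A_{1s}$ cannot
vanish simultaneously, one of the numbers
$\frac{\eta_1}{\eta_0}, \frac{\eta_2}{\eta_0}, \ldots, \frac{\eta_{s-1}}{\eta_0}$
$(\eta_0 = 1)$ must be equal to $\varepsilon$. This is only possible for
$n_1 = 1$. Then $\frac{\eta_1}{\eta_0} = \varepsilon$ and
$A_{11} = A_{13} = \cdots = A_{1s} = O$. Setting $p = 2$ in (34), we find
similarly that $n_2 = 2$ and that $A_{21} = A_{22} = A_{24} = \cdots = A_{2s} = O$,
etc. Finally, we obtain

$$A = \begin{pmatrix} O & A_{12} & O & \cdots & O \\ O & O & A_{23} & \cdots & O \\ \vdots & \vdots & \vdots & & \vdots \\ O & O & O & \cdots & A_{s-1,s} \\ A_{s1} & A_{s2} & A_{s3} & \cdots & A_{ss} \end{pmatrix}.$$

15 The number $h$ is the largest integer having these properties, because $A$
has precisely $h$ characteristic values of maximal modulus $r$. Moreover, it
follows from (31) that all the characteristic values of the matrix fall into
systems (with $h$ numbers in each) of the form
$\mu_0, \mu_0\varepsilon, \ldots, \mu_0\varepsilon^{h-1}$ and that within each
such system to any two characteristic values there correspond elementary
divisors of equal degree. One such system is formed by the roots of the
equation (4): $\lambda_0, \lambda_1, \ldots, \lambda_{h-1}$.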


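Here $n_1 = 1, n_2 = 2, \ldots, n_{s-1} = s - 1$. But then for $p = s$ on the
right-hand sides of (34) we have the factors

$$\frac{\eta_{q-1}}{\eta_{s-1}} = e^{(q - s)\frac{2\pi i}{h}} \quad (q = 1, 2, \ldots, s).$$

One of these numbers must be equal to $\varepsilon = e^{\frac{2\pi i}{h}}$. This
is only possible when $s = h$ and $q = 1$; consequently,
$A_{s2} = \cdots = A_{ss} = O$.

Thus,

$$D = \{ E_0, \varepsilon E_1, \varepsilon^2 E_2, \ldots, \varepsilon^{h-1} E_{h-1} \},$$

and the matrix $A$ has the form (5).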
Frobenius’ theorem is now completely proved. 
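
To make the relation (31) concrete, here is a worked example with $h = 3$. The matrix is a cyclic permutation matrix chosen here for illustration; it is not taken from the text.

```latex
A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
\Delta(\lambda) = \lambda^3 - 1, \qquad r = 1, \quad h = 3, \quad
\varepsilon = e^{\frac{2\pi i}{3}},
\\[4pt]
D = \{ 1, \varepsilon, \varepsilon^2 \}, \qquad
(\varepsilon D A D^{-1})_{pq} = \varepsilon\, \frac{\eta_{p-1}}{\eta_{q-1}}\, a_{pq}:
\\[4pt]
a_{12}:\ \varepsilon \cdot \frac{1}{\varepsilon} = 1, \qquad
a_{23}:\ \varepsilon \cdot \frac{\varepsilon}{\varepsilon^2} = 1, \qquad
a_{31}:\ \varepsilon \cdot \frac{\varepsilon^2}{1} = \varepsilon^3 = 1.
```

Each non-zero element is reproduced, so $A = \varepsilon D A D^{-1}$. The characteristic values $1, \varepsilon, \varepsilon^2$ are the roots of $\lambda^3 - r^3 = 0$, and multiplication by $\varepsilon$ permutes them among themselves.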


5. We now make a few general comments on Frobenius’ theorem. 


Remark 1. In the proof of Frobenius' theorem we have established
incidentally that for an irreducible matrix $A \ge O$ with the maximal
characteristic value $r$ the adjoint matrix $B(\lambda)$ is positive for
$\lambda = r$:

$$B(r) > O, \tag{35}$$

i.e.,

$$B_{ik}(r) > 0 \quad (i, k = 1, 2, \ldots, n), \tag{35'}$$

where $B_{ik}(r)$ is the algebraic complement of the element
$r\delta_{ki} - a_{ki}$ in the determinant $| rE - A |$.

Let us now consider the reduced adjoint matrix (see Vol. I, Chapter IV, § 6)

$$C(\lambda) = \frac{B(\lambda)}{D_{n-1}(\lambda)},$$

where $D_{n-1}(\lambda)$ is the greatest common divisor of all the polynomials
$B_{ik}(\lambda)$ $(i, k = 1, 2, \ldots, n)$. It follows from (35') that
$D_{n-1}(r) \ne 0$. All the roots of $D_{n-1}(\lambda)$ are characteristic
values$^{16}$ distinct from $r$. Therefore all the roots of $D_{n-1}(\lambda)$
either are complex or are real and less than $r$. Hence $D_{n-1}(r) > 0$ and
this, in conjunction with (35), yields:$^{17}$

$$C(r) = \frac{B(r)}{D_{n-1}(r)} > O. \tag{36}$$

16 $D_{n-1}(\lambda)$ is a divisor of the characteristic polynomial
$D_n(\lambda) = | \lambda E - A |$.
Remark 2. The inequality (35') enables us to determine bounds for the
maximal characteristic value $r$.

We introduce the notation

$$s_i = \sum_{k=1}^{n} a_{ik} \quad (i = 1, 2, \ldots, n), \qquad s = \min_{1 \le i \le n} s_i, \qquad S = \max_{1 \le i \le n} s_i.$$

Then: For an irreducible matrix $A \ge O$

$$s \le r \le S, \tag{37}$$

and the equality sign on the left or the right of $r$ holds for $s = S$ only;
i.e., holds only when all the 'row-sums' $s_1, s_2, \ldots, s_n$ are equal.$^{18}$

For if we add to the last column of the characteristic determinant

$$\begin{vmatrix} r - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & r - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & r - a_{nn} \end{vmatrix}$$

all the preceding columns and then expand the determinant with respect to
the elements of the last column, we obtain:

$$\sum_{k=1}^{n} (r - s_k)\, B_{nk}(r) = 0.$$

Hence (37) follows by (35').


Remark 3. An irreducible matrix $A \ge O$ cannot have two linearly
independent non-negative characteristic vectors. For suppose that, apart from
the positive characteristic vector $z > o$ corresponding to the maximal
characteristic value $r$, the matrix $A$ has another characteristic vector
$y \ge o$ (linearly independent of $z$) for the characteristic value $\alpha$:

$$Ay = \alpha y \quad (y \ne o;\ y \ge o).$$

Since $r$ is a simple root of the characteristic equation
$| \lambda E - A | = 0$,

$$\alpha \ne r.$$

We denote by $u$ the positive characteristic vector of the transposed matrix
$A^T$ for $\lambda = r$:

$$A^T u = ru \quad (u > o).$$

Then$^{19}$

$$r (y, u) = (y, A^T u) = (Ay, u) = \alpha (y, u);$$

hence, as $\alpha \ne r$,

$$(y, u) = 0,$$

and this is impossible for $u > o$, $y \ge o$, $y \ne o$.

17 In the following section it will be shown for an irreducible matrix
$A \ge O$ that $B(\lambda) > O$, $C(\lambda) > O$ for every real $\lambda \ge r$.

18 Narrower bounds for $r$ than $(s, S)$ are established in the papers [256],
[295] and [119, IV].


Remark 4. In the proof of Frobenius' Theorem we have established the
following characterization of the maximal characteristic value $r$ of an
irreducible matrix $A \ge O$:

$$r = \max_{(x \ge o)} r_x,$$

where $r_x$ is the largest number $\rho$ for which $\rho x \le Ax$. In other
words, since

$$r_x = \min_{\substack{1 \le i \le n \\ x_i \ne 0}} \frac{(Ax)_i}{x_i},$$

we have

$$r = \max_{(x \ge o)} \min_{1 \le i \le n} \frac{(Ax)_i}{x_i}.$$

Similarly, we can define for every vector $x \ge o$ $(x \ne o)$ a number $r^x$
as the least number $\sigma$ for which

$$\sigma x \ge Ax;$$

i.e., we set

$$r^x = \max_{1 \le i \le n} \frac{(Ax)_i}{x_i}.$$

If for some $i$ we have here $x_i = 0$, $(Ax)_i \ne 0$, then we shall take
$r^x = +\infty$. As in the case of the function $r_x$, it turns out here that
the function $r^x$ assumes a least value $\hat r$ for some vector $v > o$.

Let us show that the number $\hat r$ defined by

$$\hat r = \min_{(x \ge o)} r^x = \min_{(x \ge o)} \max_{1 \le i \le n} \frac{(Ax)_i}{x_i} \tag{38}$$

coincides with $r$ and that the vector $v \ge o$ $(v \ne o)$ for which this
minimum is assumed is a characteristic vector of $A$ for $\lambda = r$.

19 If $y = (y_1, y_2, \ldots, y_n)$ and $u = (u_1, u_2, \ldots, u_n)$, then we mean
by $(y, u)$ the 'scalar product' $y^T u = \sum_{i=1}^{n} y_i u_i$. Then
$(y, A^T u) = y^T A^T u$ and $(Ay, u) = (Ay)^T u = y^T A^T u$.

For,

$$\hat r v - Av \ge o \quad (v \ge o,\ v \ne o).$$

Suppose now that the sign $\ge$ cannot be replaced by the equality sign. Then
by Lemma 1

$$(E + A)^{n-1} (\hat r v - Av) > o, \quad (E + A)^{n-1} v > o. \tag{39}$$

Setting

$$u = (E + A)^{n-1} v > o,$$

we have

$$\hat r u > Au$$

and so for sufficiently small $\varepsilon > 0$

$$(\hat r - \varepsilon) u > Au \quad (u > o),$$

which contradicts the definition of $\hat r$. Thus

$$Av = \hat r v.$$

But then

$$u = (E + A)^{n-1} v = (1 + \hat r)^{n-1} v.$$

Therefore $u > o$ implies that $v > o$.

Hence, by the Remark 3,

$$\hat r = r.$$

Thus we have for $r$ the double characterization:

$$r = \max_{(x \ge o)} \min_{1 \le i \le n} \frac{(Ax)_i}{x_i} = \min_{(x \ge o)} \max_{1 \le i \le n} \frac{(Ax)_i}{x_i}. \tag{40}$$

Moreover we have shown that $\max_{(x \ge o)}$ or $\min_{(x \ge o)}$ is only
assumed for a positive characteristic vector for $\lambda = r$.

From this characterization of $r$ we obtain the inequality$^{20}$

$$\min_{1 \le i \le n} \frac{(Ax)_i}{x_i} \le r \le \max_{1 \le i \le n} \frac{(Ax)_i}{x_i} \quad (x \ge o,\ x \ne o). \tag{41}$$
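
The inequality (41) can be tried out numerically. The sketch below is only an illustration: the helper names `matvec`, `quotient_bounds` and `enclose_r`, the iteration $x \mapsto (E + A)x$ and the test matrix are all assumptions introduced here, not part of the text. It evaluates the two quotients in (41) for a positive vector $x$ and then repeats the evaluation for successively smoothed vectors.

```python
def matvec(A, x):
    """Return the product A x for a square matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def quotient_bounds(A, x):
    """Return (min_i (Ax)_i / x_i, max_i (Ax)_i / x_i) for a positive vector x."""
    Ax = matvec(A, x)
    ratios = [ax_i / x_i for ax_i, x_i in zip(Ax, x)]
    return min(ratios), max(ratios)


def enclose_r(A, steps=50):
    """Evaluate the bounds of (41) after replacing x by (E + A) x `steps` times."""
    x = [1.0] * len(A)
    for _ in range(steps):
        Ax = matvec(A, x)
        x = [xi + axi for xi, axi in zip(x, Ax)]  # x <- (E + A) x stays positive
        scale = max(x)
        x = [xi / scale for xi in x]              # rescaling leaves the ratios unchanged
    return quotient_bounds(A, x)


# Illustrative positive (hence irreducible) matrix with characteristic values 4 and 1.
A = [[2.0, 1.0],
     [2.0, 3.0]]
print(quotient_bounds(A, [1.0, 1.0]))  # the row sums s and S of (37)
print(enclose_r(A))                    # both bounds close to r = 4
```

With $x = (1, 1, \ldots, 1)$ the two quotients are just the smallest and largest row sums, so (37) appears as a special case of (41).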


Remark 5. Since in (40) $\max_{(x \ge o)}$ and $\min_{(x \ge o)}$ are only
assumed for a positive characteristic vector of the irreducible matrix
$A \ge O$, the inequalities

$$rz \le Az, \quad z \ge o, \quad z \ne o$$

or

$$rz \ge Az, \quad z \ge o, \quad z \ne o$$

always imply that

$$Az = rz, \quad z > o.$$

20 See [128] and also [17], p. 325 ff.

§ 3. Reducible Matrices 


1. The spectral properties of irreducible non-negative matrices that were
established in the preceding section are not preserved when we go over to
reducible matrices. However, since every non-negative matrix $A \ge O$ can
be represented as the limit of a sequence of irreducible positive matrices
$A_m$,

$$A = \lim_{m \to \infty} A_m \quad (A_m > O,\ m = 1, 2, \ldots), \tag{42}$$

some of the spectral properties of irreducible matrices hold in a weaker form
for reducible matrices.

For an arbitrary non-negative matrix $A = \| a_{ik} \|_1^n$ we can prove the
following theorem:

THEOREM 3: A non-negative matrix $A = \| a_{ik} \|_1^n$ always has a
non-negative characteristic value $r$ such that the moduli of all the
characteristic values of $A$ do not exceed $r$. To this 'maximal'
characteristic value $r$ there corresponds a non-negative characteristic vector

$$Ay = ry \quad (y \ge o,\ y \ne o).$$

The adjoint matrix $B(\lambda) = \| B_{ik}(\lambda) \|_1^n = (\lambda E - A)^{-1} \Delta(\lambda)$
satisfies the inequalities

$$B(\lambda) \ge O, \quad \frac{d}{d\lambda} B(\lambda) \ge O \quad \text{for } \lambda \ge r. \tag{43}$$


Proof. Let $A$ be represented as in (42). We denote by $r^{(m)}$ and $y^{(m)}$
the maximal characteristic value of the positive matrix $A_m$ and the
corresponding normalized$^{21}$ positive characteristic vector:

$$A_m y^{(m)} = r^{(m)} y^{(m)} \quad ((y^{(m)}, y^{(m)}) = 1,\ y^{(m)} > o;\ m = 1, 2, \ldots). \tag{44}$$

Then it follows from (42) that the limit

$$\lim_{m \to \infty} r^{(m)} = r$$

exists, where $r$ is a characteristic value of $A$. From the fact that
$r^{(m)} > 0$ and $r^{(m)} \ge | \lambda_0^{(m)} |$, where $\lambda_0^{(m)}$ is
an arbitrary characteristic value of $A_m$ $(m = 1, 2, \ldots)$, we obtain by
proceeding to the limit:

$$r \ge 0, \quad r \ge | \lambda_0 |,$$

where $\lambda_0$ is an arbitrary characteristic value of $A$. This passage to
the limit gives us in place of (35)

$$B(r) \ge O. \tag{45}$$

Furthermore, from the sequence of normalized characteristic vectors $y^{(m)}$
$(m = 1, 2, \ldots)$ we can select a subsequence $y^{(m_p)}$ $(p = 1, 2, \ldots)$
that converges to some normalized (and therefore non-zero) vector $y$. When we
go to the limit along this subsequence in (44), we obtain:

$$Ay = ry \quad (y \ge o,\ y \ne o).$$

21 By a normalized vector we mean a column $y = (y_1, y_2, \ldots, y_n)$ for
which $(y, y) = \sum_{i=1}^{n} y_i^2 = 1$.


The inequalities (43) will be established by induction on the order $n$.
For $n = 1$, they are obvious.$^{22}$ Let us establish them for a matrix
$A = \| a_{ik} \|_1^n$ of order $n$ on the assumption that they are true for
matrices of order less than $n$.

Expanding the characteristic determinant $\Delta(\lambda) = | \lambda E - A |$
with respect to the elements of the last row and the last column, we obtain:

$$\Delta(\lambda) = (\lambda - a_{nn})\, B_{nn}(\lambda) - \sum_{i,k=1}^{n-1} B_{ik}^{(n)}(\lambda)\, a_{in} a_{nk}. \tag{46}$$

Here $B_{nn}(\lambda) = | \lambda\delta_{ik} - a_{ik} |_1^{n-1}$ is the
characteristic determinant of a 'truncated' non-negative matrix of order
$n - 1$, and $B_{ik}^{(n)}(\lambda)$ is the algebraic complement of the element
$\lambda\delta_{ik} - a_{ik}$ in $B_{nn}(\lambda)$ $(i, k = 1, 2, \ldots, n - 1)$.
The maximal non-negative root of $B_{nn}(\lambda)$ will be denoted by $r_n$.
Then setting $\lambda = r_n$ in (46) and observing that by the induction
hypothesis

$$B_{ik}^{(n)}(r_n) \ge 0 \quad (i, k = 1, 2, \ldots, n - 1),$$

we obtain from (46):

$$\Delta(r_n) \le 0.$$

On the other hand $\Delta(\lambda) = \lambda^n + \ldots$, so that
$\Delta(+\infty) = +\infty$. Therefore $r_n$ either is a root of
$\Delta(\lambda)$ or is less than some real root of $\Delta(\lambda)$. In both
cases,

$$r_n \le r.$$

22 For since $B(\lambda) = (\lambda E - A)^{-1} \Delta(\lambda)$, we have
$B(\lambda) = E$, $\frac{d}{d\lambda} B(\lambda) = O$ for $n = 1$.

Since every principal minor $B_{jj}(\lambda)$ of order $n - 1$ can be brought
into the position of $B_{nn}(\lambda)$ by a permutation of $A$, we have

$$r_j \le r \quad (j = 1, 2, \ldots, n), \tag{47}$$

where $r_j$ denotes the maximal root of the polynomial $B_{jj}(\lambda)$
$(j = 1, 2, \ldots, n)$.

Furthermore, $B_{ik}(\lambda)$ may be represented as a minor of order $n - 1$
of the characteristic matrix $\lambda E - A$, multiplied by $(-1)^{i+k}$. When
we differentiate this determinant with respect to $\lambda$, we obtain:

$$\frac{d}{d\lambda} B_{ik}(\lambda) = \sum_{j \ne i, k} B_{ik}^{(j)}(\lambda) \quad (i, k = 1, 2, \ldots, n), \tag{48}$$

where $B^{(j)}(\lambda) = \| B_{ik}^{(j)}(\lambda) \|$ $(i, k \ne j;\ j = 1, 2, \ldots, n)$
is the adjoint matrix of the matrix $\| a_{ik} \|$
$(i, k = 1, 2, \ldots, j - 1, j + 1, \ldots, n)$ of order $n - 1$.

But, by the induction hypothesis,

$$B^{(j)}(\lambda) \ge O \quad \text{for } \lambda \ge r_j \quad (j = 1, 2, \ldots, n);$$

and so, by (47) and (48),

$$\frac{d}{d\lambda} B(\lambda) \ge O \quad \text{for } \lambda \ge r. \tag{49}$$

From (45) and (49) it follows that

$$B(\lambda) \ge O \quad \text{for } \lambda \ge r.$$

The proof of the theorem is now complete.

Note. In the passage to the limit (42) the inequalities (37) are preserved.
They hold, therefore, for an arbitrary non-negative matrix. However, the
conditions under which the equality sign holds in (37) are not valid for a
reducible matrix.
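
A small reducible matrix, chosen here only for illustration, shows how Theorem 3 and the Note weaken the statements that hold in the irreducible case.

```latex
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
\Delta(\lambda) = (\lambda - 1)^2, \qquad r = 1, \qquad
y = (1, 0) \ge o, \quad Ay = ry,
\\[4pt]
B(\lambda) = \begin{pmatrix} \lambda - 1 & 1 \\ 0 & \lambda - 1 \end{pmatrix} \ge O
\ \text{ for } \lambda \ge 1, \qquad
\frac{d}{d\lambda} B(\lambda) = E \ge O,
\\[4pt]
s_1 = 2, \quad s_2 = 1, \qquad s = 1 = r < S = 2.
```

Here the characteristic vector is non-negative but not positive, $B(r)$ contains zeros, and the equality $s = r$ holds although the row-sums are not all equal.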


2. A number of important propositions follow from Theorem 3:

1. If $A = \| a_{ik} \|_1^n$ is a non-negative matrix with maximal
characteristic value $r$ and $C(\lambda)$ is its reduced adjoint matrix, then

$$C(\lambda) \ge O \quad \text{for } \lambda \ge r. \tag{50}$$

For

$$C(\lambda) = \frac{B(\lambda)}{D_{n-1}(\lambda)}, \tag{51}$$

where $D_{n-1}(\lambda)$ is the greatest common divisor of the elements of
$B(\lambda)$. Since $D_{n-1}(\lambda)$ divides the characteristic polynomial
$\Delta(\lambda)$ and has highest coefficient 1,

$$D_{n-1}(\lambda) > 0 \quad \text{for } \lambda > r. \tag{52}$$

Now (43), (51), and (52) imply (50).
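2. If $A \ge O$ is an irreducible matrix with maximal characteristic value
$r$, then

$$B(\lambda) > O, \quad C(\lambda) > O \quad \text{for } \lambda \ge r. \tag{53}$$

Indeed, by (35) $B(r) > O$. But also (see (43))
$\frac{d}{d\lambda} B(\lambda) \ge O$ for $\lambda \ge r$. Therefore

$$B(\lambda) > O \quad \text{for } \lambda \ge r. \tag{54}$$

The other of the inequalities (53) follows from (51), (52), and (54).

3. If $A \ge O$ is an irreducible matrix with maximal characteristic value
$r$, then

$$(\lambda E - A)^{-1} > O \quad \text{for } \lambda > r. \tag{55}$$

This inequality follows from the formula

$$(\lambda E - A)^{-1} = \frac{B(\lambda)}{\Delta(\lambda)},$$

since $B(\lambda) > O$ and $\Delta(\lambda) > 0$ for $\lambda > r$.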
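
The role of the condition $\lambda > r$ in (55) can be seen in a worked example. The matrix and the two values of $\lambda$ are chosen here for illustration only.

```latex
A = \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix}, \qquad r = 4;
\\[4pt]
\lambda = 5:\quad
5E - A = \begin{pmatrix} 3 & -1 \\ -2 & 2 \end{pmatrix}, \quad
\Delta(5) = 4, \quad
(5E - A)^{-1} = \frac{1}{4} \begin{pmatrix} 2 & 1 \\ 2 & 3 \end{pmatrix} > O;
\\[4pt]
\lambda = 3:\quad
3E - A = \begin{pmatrix} 1 & -1 \\ -2 & 0 \end{pmatrix}, \quad
\Delta(3) = -2, \quad
(3E - A)^{-1} = -\frac{1}{2} \begin{pmatrix} 0 & 1 \\ 2 & 1 \end{pmatrix}.
```

For $\lambda = 5 > r$ the inverse is positive, while for $\lambda = 3 < r$ it has no positive elements, since $\Delta(3) < 0$.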


4. The maximal characteristic value $r'$ of every principal minor$^{23}$ (of
order less than $n$) of a non-negative matrix $A = \| a_{ik} \|_1^n$ does not
exceed the maximal characteristic value $r$ of $A$:

$$r' \le r. \tag{56}$$

If $A$ is irreducible, then the equality sign in (56) cannot occur.
If $A$ is reducible, then the equality sign in (56) holds for at least one
principal minor.

23 We mean here by a principal minor the matrix formed from the elements of a
principal minor.

For the inequality (56) is true for every principal minor of order $n - 1$
(see (47)). If $A$ is irreducible, then by (35') $B_{jj}(r) > 0$
$(j = 1, 2, \ldots, n)$ and therefore $r' \ne r$.

By descent from $n - 1$ to $n - 2$, from $n - 2$ to $n - 3$, etc., we show
the truth of (56) for the principal minors of every order.

If $A$ is a reducible matrix, then by means of a permutation it can be
put into the form

$$A = \begin{pmatrix} B & O \\ C & D \end{pmatrix}.$$

Then $r$ must be a characteristic value of one of the two principal minors $B$
and $D$. This proves Proposition 4.
From 4. we deduce:

5. If $A \ge O$ and if in the characteristic determinant

$$\Delta(r) = \begin{vmatrix} r - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & r - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & r - a_{nn} \end{vmatrix}$$

any principal minor vanishes ($A$ is reducible!), then every 'augmented'
minor also vanishes; in particular, so does one of the principal minors of
order $n - 1$

$$B_{11}(\lambda), B_{22}(\lambda), \ldots, B_{nn}(\lambda).$$

From 4. and 5. we deduce:

6. A matrix $A \ge O$ is reducible if and only if in one of the relations

$$B_{ii}(r) \ge 0 \quad (i = 1, 2, \ldots, n)$$

the equality sign holds.


From 4. we also deduce:

7. If $r$ is the maximal characteristic value of a matrix $A \ge O$, then for
every $\lambda > r$ all the principal minors of the characteristic matrix
$A_\lambda = \lambda E - A$ are positive:

$$A_\lambda \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \quad (\lambda > r;\ 1 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n). \tag{57}$$

It is easy to see that, conversely, (57) implies that $\lambda > r$. For

$$\Delta(\lambda + \mu) = | (\lambda + \mu) E - A | = | A_\lambda + \mu E | = \sum_{k=0}^{n} S_{n-k}\, \mu^k,$$

where $S_k$ is the sum of all the principal minors of order $k$ of the
characteristic matrix $A_\lambda = \lambda E - A$ $(k = 1, 2, \ldots, n)$ and
$S_0 = 1$.$^{24}$ Therefore, if for some real $\lambda$ all the principal
minors of $A_\lambda$ are positive, then for every $\mu \ge 0$

$$\Delta(\lambda + \mu) \ne 0,$$

i.e., no number greater than or equal to $\lambda$ is a characteristic value
of $A$. Therefore

$$r < \lambda.$$

Thus, (57) is a necessary and sufficient condition for $\lambda$ to be an upper
bound for the moduli of the characteristic values of $A$.$^{25}$ However, the
inequalities (57) are not all independent.

The matrix $\lambda E - A$ is a matrix with non-positive elements outside the
main diagonal.$^{26}$ D. M. Kotelyanskii has proved that for such matrices,
just as for symmetric matrices, all the principal minors are positive,
provided the successive principal minors are positive.$^{27}$

Lemma 3 (Kotelyanskii): If in a real matrix $G = \| g_{ik} \|_1^n$ all the
non-diagonal elements are negative or zero

$$g_{ik} \le 0 \quad (i \ne k;\ i, k = 1, 2, \ldots, n) \tag{58}$$

and the successive principal minors are positive

$$g_{11} = G \begin{pmatrix} 1 \\ 1 \end{pmatrix} > 0, \quad G \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} > 0, \quad \ldots, \quad G \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} > 0, \tag{59}$$

then all the principal minors are positive:

$$G \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \quad (1 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n).$$

24 See Vol. I, p. 70.

25 See [344].

26 It is easy to see that, conversely, every matrix with negative or zero
non-diagonal elements can be represented in the form $\lambda E - A$, where $A$
is a non-negative matrix and $\lambda$ is a real number.

27 See [215]. This paper contains a number of results about matrices in which
all the non-diagonal elements are of like sign.
Proof. We shall prove the lemma by induction on the order $n$ of the
matrix. For $n = 2$ the lemma holds, since it follows from

$$g_{12} \le 0, \quad g_{21} \le 0, \quad g_{11} > 0, \quad g_{11} g_{22} - g_{12} g_{21} > 0$$

that $g_{22} > 0$. Let us assume now that the lemma is true for matrices of
order less than $n$; we shall then prove it for $G = \| g_{ik} \|_1^n$. We
consider the bordered determinants

$$t_{ik} = G \begin{pmatrix} 1 & i \\ 1 & k \end{pmatrix} = g_{11} g_{ik} - g_{1k} g_{i1} \quad (i, k = 2, \ldots, n).$$

From (58) and (59) it follows that

$$t_{ik} \le 0 \quad (i \ne k;\ i, k = 2, \ldots, n).$$

On the other hand, by applying Sylvester's identity (Vol. I, Chapter II,
(30), p. 33) to the matrix $T = \| t_{ik} \|_2^n$, we obtain:

$$T \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} = (g_{11})^{p-1}\, G \begin{pmatrix} 1 & i_1 & i_2 & \cdots & i_p \\ 1 & i_1 & i_2 & \cdots & i_p \end{pmatrix} \quad (2 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n - 1). \tag{60}$$

Hence it follows by (59) that the successive principal minors of the matrix
$T = \| t_{ik} \|_2^n$ are positive:

$$t_{22} = T \begin{pmatrix} 2 \\ 2 \end{pmatrix} > 0, \quad T \begin{pmatrix} 2 & 3 \\ 2 & 3 \end{pmatrix} > 0, \quad \ldots, \quad T \begin{pmatrix} 2 & 3 & \cdots & n \\ 2 & 3 & \cdots & n \end{pmatrix} > 0.$$

Thus, the matrix $T = \| t_{ik} \|_2^n$ of order $n - 1$ satisfies the
condition of the lemma. Therefore by the induction hypothesis all the
principal minors are positive:

$$T \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \quad (2 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n - 1).$$

But then it follows from (60) that all the principal minors of $G$ containing
the first row are positive:

$$G \begin{pmatrix} 1 & i_1 & i_2 & \cdots & i_p \\ 1 & i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \quad (2 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n - 1). \tag{61}$$
Let us choose fixed indices $i_1, i_2, \ldots, i_{n-2}$ (where
$1 < i_1 < i_2 < \cdots < i_{n-2} \le n$) and form the matrix of order $n - 1$:

$$\| g_{\alpha\beta} \| \quad (\alpha, \beta = 1, i_1, i_2, \ldots, i_{n-2}). \tag{62}$$

The successive principal minors of this matrix are positive, by (61):

$$g_{11} > 0, \quad G \begin{pmatrix} 1 & i_1 \\ 1 & i_1 \end{pmatrix} > 0, \quad \ldots, \quad G \begin{pmatrix} 1 & i_1 & \cdots & i_{n-2} \\ 1 & i_1 & \cdots & i_{n-2} \end{pmatrix} > 0;$$

and the non-diagonal elements are non-positive:

$$g_{\alpha\beta} \le 0 \quad (\alpha \ne \beta;\ \alpha, \beta = 1, i_1, i_2, \ldots, i_{n-2}).$$

But the order of (62) is $n - 1$. Therefore, by the induction hypothesis, all
the principal minors of this matrix are positive; in particular,

$$G \begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix} > 0 \quad (2 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n - 2). \tag{63}$$

Thus, all the principal minors of $G$ of order not exceeding $n - 2$ are
positive.

Since by (63) $g_{22} > 0$, we may now consider the determinants of order
two bordering the element $g_{22}$ (and not $g_{11}$ as before):

$$t^*_{ik} = G \begin{pmatrix} 2 & i \\ 2 & k \end{pmatrix} \quad (i, k = 1, 3, \ldots, n).$$

By operating with the matrix $T^* = \| t^*_{ik} \|$ as we have done above with
$T$, we obtain inequalities analogous to (61):

$$G \begin{pmatrix} 2 & i_1 & \cdots & i_p \\ 2 & i_1 & \cdots & i_p \end{pmatrix} > 0 \tag{64}$$
$$(i_1 < i_2 < \cdots < i_p;\ i_1, \ldots, i_p = 1, 3, \ldots, n;\ p = 1, 2, \ldots, n - 1).$$

Since every principal minor of $G = \| g_{ik} \|_1^n$ contains either the
first or the second row or is of order not exceeding $n - 2$, it follows from
(61), (63), and (64) that all the principal minors of $G$ are positive. This
completes the proof of the lemma.

This lemma allows us to retain only the successive principal minors in
the condition (57) and to formulate the following theorem:$^{28}$

28 See [344] and [215]. Since $C = A - \lambda E$ and $A \ge O$, $\lambda_n$ is
real (this follows from $\lambda_n + \lambda = r$) and the corresponding
characteristic vector of $C$ is non-negative: $Cy = \lambda_n y$
$(y \ge o,\ y \ne o)$.
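THEOREM 4: A real number $\lambda$ is greater than the maximal characteristic
value $r$ of the matrix $A = \| a_{ik} \|_1^n \ge O$,

$$r < \lambda,$$

if and only if for this value $\lambda$ all the successive principal minors of
the characteristic matrix $A_\lambda = \lambda E - A$ are positive:

$$\lambda - a_{11} > 0, \quad \begin{vmatrix} \lambda - a_{11} & -a_{12} \\ -a_{21} & \lambda - a_{22} \end{vmatrix} > 0, \quad \ldots, \quad \begin{vmatrix} \lambda - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & \lambda - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & \lambda - a_{nn} \end{vmatrix} > 0. \tag{65}$$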

Let us consider one application of Theorem 4. Suppose that in the matrix
$C = \| c_{ik} \|_1^n$ all the non-diagonal elements are non-negative. Then for
some $\lambda > 0$ we have $A = C + \lambda E \ge O$. We arrange the
characteristic values $\lambda_i$ $(i = 1, 2, \ldots, n)$ of $C$ with their real
parts in ascending order:

$$\operatorname{Re} \lambda_1 \le \operatorname{Re} \lambda_2 \le \cdots \le \operatorname{Re} \lambda_n.$$

We denote by $r$ the maximal characteristic value of $A$. Since the
characteristic values of $A$ are the sums $\lambda_i + \lambda$
$(i = 1, 2, \ldots, n)$, we have

$$\lambda_n + \lambda = r.$$

In this case the inequality $r < \lambda$ holds for $\lambda_n < 0$ only, and
signifies that all the characteristic values of $C$ have negative real parts.
When we write down the inequality (65) for the matrix $-C = \lambda E - A$, we
obtain the following theorem:

THEOREM 5: The real parts of all the characteristic values of a real
matrix $C = \| c_{ik} \|_1^n$ with non-negative non-diagonal elements

$$c_{ik} \ge 0 \quad (i \ne k;\ i, k = 1, 2, \ldots, n)$$

are negative if and only if

$$c_{11} < 0, \quad \begin{vmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{vmatrix} > 0, \quad \ldots, \quad (-1)^n \begin{vmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{vmatrix} > 0. \tag{66}$$

§ 4. The Normal Form of a Reducible Matrix 


1. We consider an arbitrary reducible matrix $A = \| a_{ik} \|_1^n$. By means
of a permutation we can put it into the form

$$A = \begin{pmatrix} B & O \\ C & D \end{pmatrix}, \tag{67}$$

where $B$ and $D$ are square matrices.

If one of the matrices $B$ or $D$ is reducible, then it can also be represented
in a form similar to (67), so that $A$ then assumes the form

$$A = \begin{pmatrix} K & O & O \\ H & L & O \\ F & G & M \end{pmatrix}.$$

If one of the matrices $K, L, M$ is reducible, then the process can be
continued. Finally, by a suitable permutation we can reduce $A$ to triangular
block form

$$A = \begin{pmatrix} A_{11} & O & \cdots & O \\ A_{21} & A_{22} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ A_{s1} & A_{s2} & \cdots & A_{ss} \end{pmatrix}, \tag{68}$$

where the diagonal blocks are square irreducible matrices.

A diagonal block $A_{ii}$ $(1 \le i \le s)$ is called isolated if

$$A_{ik} = O \quad (k = 1, 2, \ldots, i - 1, i + 1, \ldots, s).$$

By a permutation of the blocks (see p. 50) in (68) we can put all the
isolated blocks in the first places along the main diagonal, so that $A$ then
assumes the form

$$A = \begin{pmatrix}
A_1 & O & \cdots & O & O & \cdots & O \\
O & A_2 & \cdots & O & O & \cdots & O \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
O & O & \cdots & A_g & O & \cdots & O \\
A_{g+1,1} & A_{g+1,2} & \cdots & A_{g+1,g} & A_{g+1} & \cdots & O \\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
A_{s1} & A_{s2} & \cdots & A_{sg} & A_{s,g+1} & \cdots & A_s
\end{pmatrix}; \tag{69}$$

here $A_1, A_2, \ldots, A_s$ are irreducible matrices, and in each row

$$A_{f1}, A_{f2}, \ldots, A_{f,f-1} \quad (f = g + 1, \ldots, s)$$

at least one matrix is different from zero.

We shall call the matrix (69) the normal form of the reducible matrix $A$.
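
One way to find the blocks of the normal form in practice is to look at the directed graph that has an edge $i \to k$ whenever $a_{ik} \ne 0$. This graph interpretation and the names `classes_of` and `normal_form_order` are assumptions of this sketch, not notation from the text. Indices that can reach each other form one diagonal block, and a block is isolated when no edge leaves it.

```python
def classes_of(A):
    """Group indices into classes of mutually reachable indices.

    There is an edge i -> k whenever A[i][k] != 0; each class corresponds
    to one diagonal block of the block-triangular form.
    """
    n = len(A)
    reach = [[i == k or A[i][k] != 0 for k in range(n)] for i in range(n)]
    for m in range(n):                       # Warshall transitive closure
        for i in range(n):
            if reach[i][m]:
                for k in range(n):
                    if reach[m][k]:
                        reach[i][k] = True
    classes, seen = [], set()
    for i in range(n):
        if i not in seen:
            cls = [k for k in range(n) if reach[i][k] and reach[k][i]]
            seen.update(cls)
            classes.append(cls)
    return classes, reach


def normal_form_order(A):
    """Return (isolated classes, remaining classes) in an order fitting (69)."""
    classes, reach = classes_of(A)
    n = len(A)

    def is_isolated(cls):
        inside = set(cls)
        return all(A[i][k] == 0 for i in cls for k in range(n) if k not in inside)

    isolated = [c for c in classes if is_isolated(c)]
    rest = [c for c in classes if not is_isolated(c)]
    # A class is placed after every class it can reach, so that the non-zero
    # off-diagonal blocks fall below the diagonal.
    rest.sort(key=lambda c: sum(1 for d in classes if reach[c[0]][d[0]]))
    return isolated, rest


A = [[0, 1, 0],
     [1, 0, 0],
     [1, 0, 1]]
print(normal_form_order(A))   # one isolated class {0, 1}, then the class {2}
```

In this example $g = 1$ and $s = 2$: the indices 0 and 1 form an isolated irreducible block, and the remaining index forms a second block with a non-zero block to its left. The closure step costs on the order of $n^3$ operations, which is adequate for small matrices; a linear-time strongly-connected-components algorithm would be the usual choice for large ones.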
Let us show that the normal form of a matrix $A$ is uniquely determined
to within a permutation of the blocks and permutations within the diagonal
blocks (the same for rows and columns).$^{29}$ For this purpose we consider
the operator $A$ corresponding to $A$ in an $n$-dimensional vector space $R$.
To the representation of $A$ in the form (69) there corresponds a decomposition
of $R$ into coordinate subspaces

$$R = R_1 + R_2 + \cdots + R_g + R_{g+1} + \cdots + R_s; \tag{70}$$

here $R_s$, $R_{s-1} + R_s$, $R_{s-2} + R_{s-1} + R_s$, ... are invariant
coordinate subspaces for $A$, and there is no intermediate invariant subspace
between any two adjacent ones in this sequence.

Suppose then that apart from the normal form (69) of the given matrix
there is another normal form corresponding to another decomposition of $R$
into coordinate subspaces:

$$R = \hat R_1 + \hat R_2 + \cdots + \hat R_g + \hat R_{g+1} + \cdots + \hat R_t. \tag{71}$$

The uniqueness of the normal form will be proved if we can show that the
decompositions (70) and (71) coincide apart from the order of the terms.

Suppose that the invariant subspace $\hat R_t$ has coordinate vectors in
common with $R_k$, but not with $R_{k+1}, \ldots, R_s$. Then $\hat R_t$ must be
entirely contained in $R_k$, since otherwise $\hat R_t$ would contain a
'smaller' invariant subspace, the intersection of $\hat R_t$ with
$R_k + R_{k+1} + \cdots + R_s$. Moreover, $\hat R_t$ must coincide with $R_k$,
since otherwise the invariant subspace $\hat R_t + R_{k+1} + \cdots + R_s$
would be intermediate between $R_k + R_{k+1} + \cdots + R_s$ and
$R_{k+1} + \cdots + R_s$.

Since $\hat R_t$ coincides with $R_k$, $R_k$ is an invariant subspace.
Therefore, without infringing the normal form of the matrix, $R_k$ can be put
in the place of $R_s$.

Thus, we may assume that in (70) and (71) $\hat R_t = R_s$.

Let us now consider the coordinate subspace $\hat R_{t-1}$. Suppose that it has
coordinate vectors in common with $R_l$ $(l < s)$, but not with
$R_{l+1}, \ldots, R_s$. Then the invariant subspace $\hat R_{t-1} + \hat R_t$
must be entirely contained in $R_l + R_{l+1} + \cdots + R_s$, since otherwise
there would be an invariant coordinate subspace intermediate between
$\hat R_t$ and $\hat R_{t-1} + \hat R_t$. Therefore $\hat R_{t-1} \subset R_l$.
Moreover $\hat R_{t-1} = R_l$, since otherwise
$\hat R_{t-1} + R_{l+1} + \cdots + R_s$ would be an invariant subspace
intermediate between $R_l + R_{l+1} + \cdots + R_s$ and
$R_{l+1} + \cdots + R_s$. From $\hat R_{t-1} = R_l$ it follows that $R_l + R_s$
is an invariant subspace. Therefore $R_l$ may be put in the place of $R_{s-1}$
and then we have

$$\hat R_{t-1} = R_{s-1}, \quad \hat R_t = R_s.$$

Continuing this process, we finally reach the conclusion that $s = t$ and
that the decompositions (70) and (71) coincide apart from the order of the
terms. The corresponding normal forms then coincide to within a permutation
of the blocks.

29 Without violating the normal form we can permute the first $g$ blocks
arbitrarily among each other. Moreover, sometimes certain permutations among
the last $s - g$ blocks are possible with preservation of the normal form.

From the uniqueness of the normal form it follows that the numbers
$g$ and $s$ are invariants of the non-negative matrix $A$.$^{30}$


2. Making use of the normal form, we shall now prove the following
theorem:

THEOREM 6: To the maximal characteristic value $r$ of the matrix $A \ge O$
there belongs a positive characteristic vector if and only if in the normal
form (69) of $A$: 1) each of the matrices $A_1, A_2, \ldots, A_g$ has $r$ as a
characteristic value; and (in case $g < s$) 2) none of the matrices
$A_{g+1}, \ldots, A_s$ has this property.

Proof. 1. Let $z > o$ be a positive characteristic vector belonging to the
maximal characteristic value $r$. In accordance with the dissection into
blocks in (69) we dissect the column $z$ into parts $z^k$
$(k = 1, 2, \ldots, s)$. Then the equation

$$Az = rz \quad (z > o) \tag{72}$$

is replaced by two systems of equations

$$A_i z^i = r z^i \quad (i = 1, 2, \ldots, g), \tag{72'}$$
$$\sum_{h=1}^{j-1} A_{jh} z^h + A_j z^j = r z^j \quad (j = g + 1, \ldots, s). \tag{72''}$$

From (72') it follows that $r$ is a characteristic value of each of the
matrices $A_1, A_2, \ldots, A_g$. From (72'') we find:

$$A_j z^j \le r z^j, \quad A_j z^j \ne r z^j \quad (j = g + 1, \ldots, s). \tag{73}$$

We denote by $r_j$ the maximal characteristic value of $A_j$
$(j = g + 1, \ldots, s)$. Then (see (41) on p. 65) we find from (73):

$$r_j \le \max_i \frac{(A_j z^j)_i}{z^j_i} \le r \quad (j = g + 1, \ldots, s).$$

30 For an irreducible matrix, $g = s = 1$.
On the other hand, the equation $r_j = r$ would contradict the second of the
relations (73) (see Remark 5 on p. 65). Therefore

$$r_j < r \quad (j = g + 1, \ldots, s). \tag{74}$$

2. Suppose now, conversely, that the maximal characteristic values of
the matrices $A_i$ $(i = 1, 2, \ldots, g)$ are equal to $r$ and that (74) holds
for the matrices $A_j$ $(j = g + 1, \ldots, s)$. Then by replacing the required
equation (72) by the systems (72'), (72'') we can define positive
characteristic columns $z^i$ of the matrices $A_i$ $(i = 1, 2, \ldots, g)$ by
means of (72'). Next we find columns $z^j$ $(j = g + 1, \ldots, s)$ from (72''):

$$z^j = (r E_j - A_j)^{-1} \sum_{h=1}^{j-1} A_{jh} z^h \quad (j = g + 1, \ldots, s), \tag{75}$$

where $E_j$ is the unit matrix of the same order as $A_j$
$(j = g + 1, \ldots, s)$.

Since $r_j < r$ $(j = g + 1, \ldots, s)$, we have (see (55) on p. 69)

$$(r E_j - A_j)^{-1} > O \quad (j = g + 1, \ldots, s). \tag{76}$$

Let us prove by induction that the columns $z^{g+1}, \ldots, z^s$ defined by
(75) are positive. We shall show that for every $j$ $(g + 1 \le j \le s)$ the
fact that $z^1, z^2, \ldots, z^{j-1}$ are positive implies that $z^j > o$.
Indeed, in this case,

$$\sum_{h=1}^{j-1} A_{jh} z^h \ge o, \quad \sum_{h=1}^{j-1} A_{jh} z^h \ne o,$$

which in conjunction with (76) yields, by (75):

$$z^j > o.$$

Thus, the positive column $z = (z^1, \ldots, z^s)$ is a characteristic vector
of $A$ for the characteristic value $r$. This completes the proof of the
theorem.


3. The following theorem gives a characterization of a matrix $A \geq O$
which together with its transpose $A^T$ has the property that a positive
characteristic vector belongs to the maximal characteristic value.


THEOREM 7:³¹ To the maximal characteristic value $r$ of a matrix $A \geq O$
there belongs a positive characteristic vector both of $A$ and of $A^T$ if and
only if $A$ can be represented by a permutation in quasi-diagonal form

    $A = \{A_1, A_2, \ldots, A_s\}$,    (77)

where $A_1, A_2, \ldots, A_s$ are irreducible matrices each of which has $r$ as its
maximal characteristic value.


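31 See [166].

Proof. Suppose that $A$ and $A^T$ have positive characteristic vectors for
$\lambda = r$. Then, by Theorem 6, $A$ is representable in the normal form (69),
where $A_1, A_2, \ldots, A_g$ have $r$ as maximal characteristic value and (for $g < s$)
the maximal characteristic values of $A_{g+1}, \ldots, A_s$ are less than $r$. Then

    $A^T = \begin{pmatrix}
    A_1^T  & \cdots & O      & A_{g+1,1}^T & \cdots & A_{s1}^T    \\
    \vdots & \ddots & \vdots & \vdots      &        & \vdots      \\
    O      & \cdots & A_g^T  & A_{g+1,g}^T & \cdots & A_{sg}^T    \\
    O      & \cdots & O      & A_{g+1}^T   & \cdots & A_{s,g+1}^T \\
    \vdots &        & \vdots & \vdots      & \ddots & \vdots      \\
    O      & \cdots & O      & O           & \cdots & A_s^T
    \end{pmatrix}$.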


Let us reverse here the order of the blocks in this matrix:

    $\begin{pmatrix}
    A_s^T       & O           & \cdots & O      \\
    A_{s,s-1}^T & A_{s-1}^T   & \cdots & O      \\
    \vdots      & \vdots      & \ddots & \vdots \\
    A_{s1}^T    & A_{s-1,1}^T & \cdots & A_1^T
    \end{pmatrix}$.    (78)

Since $A_s^T, A_{s-1}^T, \ldots, A_1^T$ are irreducible, we obtain a normal form for (78)
by a permutation of the blocks, placing the isolated blocks first along the
main diagonal. One of these isolated blocks is $A_s^T$. Since the normal form
of $A^T$ must satisfy the conditions of the preceding theorem, the maximal
characteristic value of $A_s^T$ must be equal to $r$. This is only possible when
$g = s$. But then the normal form (69) goes over into (77).

If, conversely, a representation (77) of $A$ is given, then

    $A^T = \{A_1^T, A_2^T, \ldots, A_s^T\}$.    (79)

We then deduce from (77) and (79), by the preceding theorem, that $A$ and
$A^T$ have positive characteristic vectors for the maximal characteristic
value $r$. This proves the theorem.


COROLLARY. If the maximal characteristic value $r$ of a matrix $A \geq O$
is simple and if positive characteristic vectors belong to $r$ both in $A$ and
$A^T$, then $A$ is irreducible.

Since, conversely, every irreducible matrix has the properties of this
corollary, these properties provide a spectral characterization of an
irreducible non-negative matrix.


§ 5. Primitive and Imprimitive Matrices 


1. We begin with a classification of irreducible matrices. 


DEFINITION 3: If an irreducible matrix $A \geq O$ has $h$ characteristic
values $\lambda_1, \lambda_2, \ldots, \lambda_h$ of maximal modulus $r$
($\lambda_1 = |\lambda_2| = \cdots = |\lambda_h| = r$), then $A$ is called primitive if $h = 1$
and imprimitive if $h > 1$. $h$ is called the index of imprimitivity of $A$.

The index of imprimitivity $h$ is easily determined if the coefficients of
the characteristic equation of the matrix are known

    $\Delta(\lambda) \equiv \lambda^n + a_1 \lambda^{n_1} + a_2 \lambda^{n_2} + \cdots + a_t \lambda^{n_t} = 0$
    $(n > n_1 > \cdots > n_t;\ a_1 \neq 0,\ a_2 \neq 0,\ \ldots,\ a_t \neq 0)$;

namely: $h$ is the greatest common divisor of the differences

    $n - n_1,\ n_1 - n_2,\ \ldots,\ n_{t-1} - n_t$.    (80)


For by Frobenius' theorem the spectrum of $A$ in the complex $\lambda$-plane
goes over into itself under a rotation through $2\pi/h$ around the point
$\lambda = 0$. Therefore the polynomial $\Delta(\lambda)$ must be obtained from some
polynomial $g(\mu)$ by the formula

    $\Delta(\lambda) = g(\lambda^h)\, \lambda^{n'}$.

Hence it follows that $h$ is a common divisor of the differences (80). But
then $h$ is the greatest common divisor $d$ of these differences, since the
spectrum does not change under a rotation by $2\pi/d$, which is impossible
for $h < d$.
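The rule (80) can be tried out numerically. The following is a minimal sketch, not part of the original text; the function name `imprimitivity_index` and its input convention (the list of exponents of $\lambda$ that actually occur in $\Delta(\lambda)$, in decreasing order) are assumptions made for the illustration. The rule presupposes that the matrix is irreducible.

```python
from functools import reduce
from math import gcd


def imprimitivity_index(exponents):
    """Return the gcd of the differences n - n_1, n_1 - n_2, ... from (80).

    `exponents` lists, in decreasing order, the powers of lambda that occur
    with a non-zero coefficient in the characteristic polynomial, starting
    with n itself. At least two exponents are expected.
    """
    differences = [hi - lo for hi, lo in zip(exponents, exponents[1:])]
    return reduce(gcd, differences)


# [[0, 1], [1, 0]] has Delta(lambda) = lambda^2 - 1: exponents 2 and 0.
print(imprimitivity_index([2, 0]))
# [[1, 1], [1, 0]] has Delta(lambda) = lambda^2 - lambda - 1: exponents 2, 1, 0.
print(imprimitivity_index([2, 1, 0]))
# The cyclic permutation matrix of order 3 has Delta(lambda) = lambda^3 - 1.
print(imprimitivity_index([3, 0]))
```

The first and third matrices are imprimitive (their spectra are invariant under rotation by $\pi$ and $2\pi/3$ respectively), while the second one is primitive.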
The following theorem establishes an important property of a primitive
matrix:

THEOREM 8: A matrix $A \geq O$ is primitive if and only if some power of
$A$ is positive:

    $A^p > O \quad (p \geq 1)$.    (81)


Proof. If $A^p > O$, then $A$ is irreducible, since the reducibility of $A$
would imply that of $A^p$. Moreover, for $A$ we have $h = 1$, since otherwise
the positive matrix $A^p$ would have $h$ ($> 1$) characteristic values

    $\lambda_1^p, \lambda_2^p, \ldots, \lambda_h^p$


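of maximal modulus $r^p$, and this contradicts Perron's theorem.

Suppose now, conversely, that $A$ is primitive. We apply the formula
(23) of Chapter V (Vol. I, p. 107) to $A^p$:

    $A^p = \sum_{k=1}^{s} \frac{1}{(m_k - 1)!} \left[ \frac{C(\lambda)\, \lambda^p}{\psi_k(\lambda)} \right]^{(m_k - 1)}_{\lambda = \lambda_k}$,    (82)

where

    $\psi(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_s)^{m_s} \quad (\lambda_j \neq \lambda_f \text{ for } j \neq f)$

is the minimal polynomial of $A$, $\psi_k(\lambda) = \psi(\lambda) / (\lambda - \lambda_k)^{m_k}$
($k = 1, 2, \ldots, s$), and $C(\lambda) = (\lambda E - A)^{-1} \psi(\lambda)$ is the reduced adjoint matrix.

In this case, we can set:

    $\lambda_1 = r > |\lambda_2| \geq \cdots \geq |\lambda_s|$ and $m_1 = 1$.    (83)

Then (82) assumes the form

    $A^p = \frac{C(r)}{\psi'(r)}\, r^p + \sum_{k=2}^{s} \frac{1}{(m_k - 1)!} \left[ \frac{C(\lambda)\, \lambda^p}{\psi_k(\lambda)} \right]^{(m_k - 1)}_{\lambda = \lambda_k}$.

Hence it is easy to deduce by (83) that

    $\lim_{p \to \infty} \frac{A^p}{r^p} = \frac{C(r)}{\psi'(r)}$.    (84)

On the other hand, $C(r) > O$ (see (53)) and $\psi'(r) > 0$ by (83). Therefore

    $\lim_{p \to \infty} \frac{A^p}{r^p} > O$,

and so (81) must hold from some $p$ onwards.³² This completes the proof.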
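Theorem 8 suggests a direct test for primitivity: raise the zero/non-zero pattern of $A$ to successive powers and look for a power without zeros. The sketch below is an illustration only, not part of the original text. The helper names are hypothetical, and the stopping exponent $(n-1)^2 + 1$ is Wielandt's classical bound for primitive matrices of order $n$, which is taken here as an assumption from outside this section.

```python
def bool_mul(X, Y):
    """Boolean matrix product: entry (i, j) is True if some path i -> k -> j exists."""
    n = len(X)
    return [[any(X[i][k] and Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_primitive(A):
    """Return True if some power A^p (p >= 1) of the non-negative matrix A is positive.

    Only the sign pattern of A matters, so the powers are formed with
    Boolean arithmetic. Powers up to (n - 1)^2 + 1 are inspected.
    """
    n = len(A)
    pattern = [[a > 0 for a in row] for row in A]
    power = pattern                          # pattern of A^1
    for _ in range((n - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        power = bool_mul(power, pattern)     # pattern of the next power
    return False


print(is_primitive([[1, 1], [1, 0]]))                   # A^2 is positive
print(is_primitive([[0, 1], [1, 0]]))                   # powers alternate, h = 2
print(is_primitive([[0, 1, 0], [0, 0, 1], [1, 1, 0]]))  # first positive power is A^5
```

The third matrix needs the full exponent $(3-1)^2 + 1 = 5$, which shows that the loop bound cannot in general be shortened.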
We shall now prove the following theorem:

THEOREM 9: If $A \geq O$ is an irreducible matrix and some power $A^q$ of
$A$ is reducible, then $A^q$ is completely reducible, i.e., $A^q$ can be
represented by means of a permutation in the form

    $A^q = \{A_1, A_2, \ldots, A_d\}$,    (85)

where $A_1, A_2, \ldots, A_d$ are irreducible matrices having one and the same
maximal characteristic value. Here $d$ is the greatest common divisor of $q$
and $h$, where $h$ is the index of imprimitivity of $A$.


32 As regards a lower bound for the exponent $p$ in (81), see [384].

Proof. Since $A$ is irreducible, we know by Frobenius' theorem that
positive characteristic vectors belong to the maximal characteristic value
$r$, both in $A$ and in $A^T$. But then these positive vectors are also
characteristic vectors of the non-negative matrices $A^q$ and $(A^q)^T$ for the
characteristic value $\lambda = r^q$. Therefore by applying Theorem 7 to $A^q$, we
represent this matrix (after a suitable permutation) in the form (85), where
$A_1, A_2, \ldots, A_d$ are irreducible matrices with the same maximal
characteristic value $r^q$. But $A$ has $h$ characteristic values of maximal
modulus $r$:

    $r,\ r\varepsilon,\ \ldots,\ r\varepsilon^{h-1} \quad (\varepsilon = e^{2\pi i / h})$.

Therefore $A^q$ also has $h$ characteristic values of maximal modulus

    $r^q,\ r^q \varepsilon^q,\ \ldots,\ r^q \varepsilon^{q(h-1)}$,

among which $d$ are equal to $r^q$. This is only possible when $d$ is the
greatest common divisor of $q$ and $h$. This proves the theorem.

For $h = 1$, we obtain:

COROLLARY 1: A power of a primitive matrix is irreducible and primitive.

If we set $q = h$ in the theorem, then we obtain:

COROLLARY 2: If $A$ is an imprimitive matrix with index of imprimitivity
$h$, then $A^h$ splits into $h$ primitive matrices with the same maximal
characteristic value.
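Theorem 9 can be checked on a small example. In the sketch below (an illustration only; all function names are hypothetical) the matrix is the cyclic permutation matrix of order 6, which is irreducible with $h = 6$. Since Theorem 9 guarantees that $A^q$ is completely reducible, the number $d$ of diagonal blocks equals the number of strongly connected classes of the graph of $A^q$, and this number is compared with the greatest common divisor of $q$ and $h$.

```python
from math import gcd


def bool_mul(X, Y):
    """Boolean matrix product of two square patterns."""
    n = len(X)
    return [[any(X[i][k] and Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bool_power(A, q):
    """Zero/non-zero pattern of A^q for a non-negative matrix A and q >= 1."""
    pattern = [[a > 0 for a in row] for row in A]
    result = pattern
    for _ in range(q - 1):
        result = bool_mul(result, pattern)
    return result


def count_classes(M):
    """Number of strongly connected classes of the directed graph of M."""
    n = len(M)
    reach = [[bool(M[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):                      # Warshall's transitive closure
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    classes = {frozenset(j for j in range(n) if reach[i][j] and reach[j][i])
               for i in range(n)}
    return len(classes)


h = 6
cycle = [[1 if j == (i + 1) % h else 0 for j in range(h)] for i in range(h)]
for q in (2, 3, 4, 5, 6):
    print(q, count_classes(bool_power(cycle, q)), gcd(q, h))
```

For each $q$ the last two printed numbers agree; for $q = h = 6$ the power is the unit matrix, which splits into six blocks of order 1, in accordance with Corollary 2.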


§ 6. Stochastic Matrices 


1. We consider $n$ possible states of a certain system

    $S_1, S_2, \ldots, S_n$    (86)

and a sequence of instants

    $t_0, t_1, t_2, \ldots$.

Suppose that at each of these instants the system is in one and only one
of the states (86) and that $p_{ij}$ denotes the probability of finding the system
in the state $S_j$ at the instant $t_k$ if it is known that at the preceding instant
$t_{k-1}$ the system is in the state $S_i$ ($i, j = 1, 2, \ldots, n$; $k = 1, 2, \ldots$). We shall
assume that the transition probability $p_{ij}$ ($i, j = 1, 2, \ldots, n$) does not depend
on the index $k$ (of the instant $t_k$).

If the matrix of transition probabilities is given,

    $P = \| p_{ij} \|_1^n$,

then we say that we have a homogeneous Markov chain with a finite number
of states.³³ It is obvious that

    $p_{ij} \geq 0, \quad \sum_{j=1}^{n} p_{ij} = 1 \quad (i, j = 1, 2, \ldots, n)$.    (87)

DEFINITION 4: A square matrix $P = \| p_{ij} \|_1^n$ is called stochastic if $P$ is
non-negative and if the sum of the elements of each row of $P$ is 1, i.e., if the
relations (87) hold.³⁴


Thus, for every homogeneous Markov chain the matrix of transition 
probabilities is stochastic and, conversely, every stochastic matrix can be 
regarded as the matrix of transition probabilities of some homogeneous 
Markov chain. This is the basis of the matrix method of investigating
homogeneous Markov chains.³⁵

A stochastic matrix is a special form of a non-negative matrix. There- 
fore all the concepts and propositions of the preceding sections are applicable 
to it. 

We mention some specific properties of a stochastic matrix. From the
definition of a stochastic matrix it follows that it has the characteristic value
1 with the positive characteristic vector $z = (1, 1, \ldots, 1)$. It is easy to see
that, conversely, every matrix $P \geq O$ having the characteristic vector
$(1, 1, \ldots, 1)$ for the characteristic value 1 is stochastic. Moreover, 1 is the
maximal characteristic value of a stochastic matrix, since the maximal
characteristic value is always included between the largest and the smallest of
the row sums³⁶ and in a stochastic matrix all the row sums are 1. Thus,
we have proved the proposition:


1. A non-negative matrix $P \geq O$ is stochastic if and only if it has the
characteristic vector $(1, 1, \ldots, 1)$ for the characteristic value 1. For a
stochastic matrix the maximal characteristic value is 1.


Now let $A = \| a_{ik} \|_1^n$ be a non-negative matrix with a positive maximal
characteristic value $r > 0$ and a corresponding positive characteristic vector
$z = (z_1, z_2, \ldots, z_n) > o$:


33 See [212] and [46], pp. 9-12.

34 Sometimes the additional condition $\sum_{i=1}^{n} p_{ij} \neq 0$ ($j = 1, 2, \ldots, n$) is
included in the definition of a stochastic matrix. See [46], p. 13.


35 The theory of homogeneous Markov chains with a finite (and a countable) number 
of states was introduced by Kolmogorov (see [212]). The reader can find an account 
of the later introduction and development of the matrix method with applications to 
homogeneous Markov chains in the memoir [329] and in the monograph [46] by V. I. 
Romanovskii (see also [4], Appendix 5). 


36 See (37) and the note on p. 68.

    $\sum_{j=1}^{n} a_{ij} z_j = r z_i \quad (i = 1, 2, \ldots, n)$.    (88)

We introduce the diagonal matrix $Z = \{z_1, z_2, \ldots, z_n\}$ and the matrix
$P = \| p_{ij} \|_1^n$:

    $P = \frac{1}{r} Z^{-1} A Z$.

Then

    $p_{ij} = \frac{1}{r}\, z_i^{-1} a_{ij} z_j \geq 0 \quad (i, j = 1, 2, \ldots, n)$,

and by (88)

    $\sum_{j=1}^{n} p_{ij} = 1 \quad (i = 1, 2, \ldots, n)$.
Thus: 


2. A non-negative matrix $A \geq O$ with the maximal positive characteristic
value $r > 0$ and with a corresponding positive characteristic vector
$z = (z_1, z_2, \ldots, z_n) > o$ is similar to the product of $r$ and a stochastic
matrix:³⁷

    $A = Z\, r P\, Z^{-1} \quad (Z = \{z_1, z_2, \ldots, z_n\} > O)$.    (89)


In a preceding section we have given (see Theorem 6, § 4) a
characterization of the class of non-negative matrices having a positive
characteristic vector for $\lambda = r$. The formula (89) establishes a close
connection between this class and the class of stochastic matrices.
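The passage from $A$ to the stochastic matrix $P = \frac{1}{r} Z^{-1} A Z$ can be carried out entry by entry. The following sketch is an illustration only; the function name `to_stochastic` and the sample matrix are hypothetical choices, and exact rational arithmetic is used so that the row sums come out as exactly 1.

```python
from fractions import Fraction


def to_stochastic(A, r, z):
    """Return P = (1/r) Z^{-1} A Z with Z = diag(z), using exact fractions.

    The construction presupposes r > 0, z > 0 and A z = r z, i.e. (88);
    this is checked before P is formed.
    """
    n = len(A)
    for i in range(n):
        if sum(A[i][j] * z[j] for j in range(n)) != r * z[i]:
            raise ValueError("z is not a characteristic vector of A for r")
    return [[Fraction(A[i][j] * z[j], r * z[i]) for j in range(n)]
            for i in range(n)]


# Hypothetical example: A has maximal characteristic value r = 4
# with the positive characteristic vector z = (1, 2).
A = [[2, 1], [4, 2]]
P = to_stochastic(A, r=4, z=[1, 2])
print(P)
print([sum(row) for row in P])   # every row sum equals 1
```

Here every element of $P$ equals $1/2$, so $A$ is similar to $4P$ in accordance with (89).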


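2. We shall now prove the following theorem:

THEOREM 10: To the characteristic value 1 of a stochastic matrix there
always correspond only elementary divisors of the first degree.

Proof. We apply the decomposition (69), § 4, to the stochastic matrix
$P = \| p_{ij} \|_1^n$:

    $P = \begin{pmatrix}
    A_1       & O         & \cdots & O         & O       & \cdots & O      \\
    O         & A_2       & \cdots & O         & O       & \cdots & O      \\
    \vdots    &           & \ddots &           &         &        & \vdots \\
    O         & O         & \cdots & A_g       & O       & \cdots & O      \\
    A_{g+1,1} & A_{g+1,2} & \cdots & A_{g+1,g} & A_{g+1} & \cdots & O      \\
    \vdots    &           &        &           &         & \ddots & \vdots \\
    A_{s1}    & A_{s2}    & \cdots & A_{sg}    & A_{s,g+1} & \cdots & A_s
    \end{pmatrix}$,

where $A_1, A_2, \ldots, A_s$ are irreducible and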


37 Proposition 2 also holds for $r = 0$, since $A \geq O$, $z > o$ implies that $A = O$.

    $A_{f1} + A_{f2} + \cdots + A_{f,f-1} \neq O \quad (f = g+1, \ldots, s)$.

Here $A_1, A_2, \ldots, A_g$ are stochastic matrices, so that each has the simple
characteristic value 1. As regards the remaining irreducible matrices
$A_{g+1}, \ldots, A_s$, by the Remark 2 on p. 63 their maximal characteristic values
are less than 1, since in each of these matrices at least one row sum is less
than 1.³⁸

Thus, the matrix $P$ is representable in the form

    $P = \begin{pmatrix} Q_1 & O \\ S & Q_2 \end{pmatrix}$,

where in $Q_1$ to the value 1 there correspond elementary divisors of the first
degree, and where 1 is not a characteristic value of $Q_2$. The theorem now
follows immediately from the following lemma:


LEMMA 4: If a matrix $A$ has the form

    $A = \begin{pmatrix} Q_1 & O \\ S & Q_2 \end{pmatrix}$,    (90)

where $Q_1$ and $Q_2$ are square matrices, and if the characteristic value
$\lambda_0$ of $A$ is also a characteristic value of $Q_1$, but not of $Q_2$,

    $|Q_1 - \lambda_0 E| = 0, \quad |Q_2 - \lambda_0 E| \neq 0$,

then the elementary divisors of $A$ and $Q_1$ corresponding to the
characteristic value $\lambda_0$ are the same.


Proof. 1. To begin with, we consider the case where $Q_1$ and $Q_2$ do not
have characteristic values in common. Let us show that in this case the
elementary divisors of $Q_1$ and $Q_2$ together form the system of elementary
divisors of $A$, i.e., for some matrix $T$ ($|T| \neq 0$)

    $T A T^{-1} = \begin{pmatrix} Q_1 & O \\ O & Q_2 \end{pmatrix}$.    (91)

We shall look for the matrix $T$ in the form

    $T = \begin{pmatrix} E_1 & O \\ U & E_2 \end{pmatrix}$


38 These properties of the matrices $A_1, \ldots, A_s$ also follow from Theorem 6.

(the dissection of $T$ into blocks corresponds to that of $A$; $E_1$ and $E_2$ are
unit matrices). Then

    $T A T^{-1} = \begin{pmatrix} E_1 & O \\ U & E_2 \end{pmatrix}
    \begin{pmatrix} Q_1 & O \\ S & Q_2 \end{pmatrix}
    \begin{pmatrix} E_1 & O \\ -U & E_2 \end{pmatrix}
    = \begin{pmatrix} Q_1 & O \\ U Q_1 - Q_2 U + S & Q_2 \end{pmatrix}$.    (91')

The equation (91') reduces to (91) if we choose the rectangular matrix
$U$ so that it satisfies the matrix equation

    $Q_2 U - U Q_1 = S$.

If $Q_1$ and $Q_2$ have no characteristic values in common, then this equation
always has a unique solution for every right-hand side $S$ (see Vol. I, Chapter
VIII, § 3).

2. In the case where $Q_1$ and $Q_2$ have characteristic values in common,
we replace $Q_1$ in (90) by its Jordan form $J$ (as a result, $A$ is replaced by a
similar matrix). Let $J = \{J_1, J_2\}$, where all the Jordan blocks with the
characteristic value $\lambda_0$ are combined in $J_1$. Then

    $A = \begin{pmatrix} J_1 & O & O \\ O & J_2 & O \\ S_1 & S_2 & Q_2 \end{pmatrix}
    = \begin{pmatrix} J_1 & O \\ \hat{S} & \hat{Q}_2 \end{pmatrix},
    \quad \hat{S} = \begin{pmatrix} O \\ S_1 \end{pmatrix},
    \quad \hat{Q}_2 = \begin{pmatrix} J_2 & O \\ S_2 & Q_2 \end{pmatrix}$.

This matrix falls under the preceding case, since the matrices $J_1$ and
$\hat{Q}_2$ have no characteristic values in common. Hence it follows that the
elementary divisors of the form $(\lambda - \lambda_0)^p$ are the same for $A$ and $J_1$ and
therefore also for $A$ and $Q_1$. This proves the lemma.

If an irreducible stochastic matrix $P$ has a complex characteristic value
$\lambda_0$ with $|\lambda_0| = 1$, then $\lambda_0 P$ is similar to $P$ (see (16)) and so it follows
from Theorem 10 that to $\lambda_0$ there correspond only elementary divisors of the
first degree. With the help of the normal form and of Lemma 4 it is easy to
extend this statement to reducible stochastic matrices. Thus we obtain:


COROLLARY 1. If $\lambda_0$ is a characteristic value of a stochastic matrix $P$ and
$|\lambda_0| = 1$, then the elementary divisors corresponding to $\lambda_0$ are of the first
degree.

From Theorem 10 we also deduce by 2. (p. 84): 


COROLLARY 2. If a positive characteristic vector belongs to the maximal
characteristic value $r$ of a non-negative matrix $A$, then all the elementary
divisors of $A$ that belong to a characteristic value $\lambda_0$ with $|\lambda_0| = r$ are of
the first degree.

3. We shall now mention some papers that deal with the distribution of the
characteristic values of stochastic matrices.

A characteristic value of a stochastic matrix $P$ always lies in the disc
$|\lambda| \leq 1$ of the $\lambda$-plane. The set of all points of this disc that are
characteristic values of any stochastic matrices of order $n$ will be denoted
by $M_n$.

In 1938, in connection with investigations on Markov chains, A. N.
Kolmogorov raised the problem of determining the structure of the domain $M_n$.
This problem was partially solved in 1945 by N. A. Dmitriev and E. B.
Dynkin [133], [133a] and completely in 1951 in a paper by F. I. Karpelevich
[209]. It turned out that the boundary of $M_n$ consists of a finite number
of points on the circle $|\lambda| = 1$ and certain curvilinear arcs joining these
points in cyclic order.

We note that by Proposition 2 (p. 84) the characteristic values of the
matrices $A = \| a_{ik} \|_1^n \geq O$ having a positive characteristic vector for $\lambda = r$
with a fixed $r$ form the set $r \cdot M_n$.³⁹ Since every matrix $A = \| a_{ik} \|_1^n \geq O$
can be regarded as the limit of a sequence of non-negative matrices of that
type and the set $r \cdot M_n$ is closed, the characteristic values of arbitrary
matrices $A = \| a_{ik} \|_1^n \geq O$ with a given maximal characteristic value $r$ fill
out the set $r \cdot M_n$.⁴⁰

A paper by H. R. Suleimanova [359] is relevant in this context; it
contains sufficiency criteria for $n$ given real numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ to be the
characteristic values of a stochastic matrix $P = \| p_{ij} \|_1^n$.⁴¹


§ 7. Limiting Probabilities for a Homogeneous Markov Chain 
with a Finite Number of States 


1. Let

    $S_1, S_2, \ldots, S_n$

be all the possible states of a system in a homogeneous Markov chain and let
$P = \| p_{ij} \|_1^n$ be the stochastic matrix determined by this chain that is formed
from the transition probabilities $p_{ij}$ ($i, j = 1, 2, \ldots, n$) (see p. 82).

We denote by $p_{ij}^{(q)}$ the probability of finding the system in the state $S_j$ at
the instant $t_k$ if it is known that at the instant $t_{k-q}$ it is in the state $S_i$
($i, j = 1, 2, \ldots, n$; $q = 1, 2, \ldots$). Clearly, $p_{ij}^{(1)} = p_{ij}$ ($i, j = 1, 2, \ldots, n$).


39 $r \cdot M_n$ is the set of points in the $\lambda$-plane of the form $r\mu$, where $\mu \in M_n$.

40 Kolmogorov has shown (see [133a (1946)], Appendix) that this problem for an
arbitrary matrix $A \geq O$ can be reduced to the analogous problem for a stochastic matrix.

41 See also [312].

Making use of the theorems on the addition and multiplication of
probabilities, we find easily:

    $p_{ij}^{(q+1)} = \sum_{h=1}^{n} p_{ih}^{(q)} p_{hj} \quad (i, j = 1, 2, \ldots, n)$

or, in matrix notation,

    $\| p_{ij}^{(q+1)} \| = \| p_{ij}^{(q)} \| \cdot \| p_{ij} \|$.

Hence, by giving to $q$ in succession the values 1, 2, ..., we obtain the
important formula⁴²

    $\| p_{ij}^{(q)} \| = P^q \quad (q = 1, 2, \ldots)$.
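The recurrence for the $q$-step transition probabilities is just repeated matrix multiplication. The sketch below is an illustration only (the function names and the two-state matrix are hypothetical); it applies the recurrence $q - 1$ times and uses exact fractions so that the row sums can be checked exactly.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def q_step(P, q):
    """Matrix of q-step transition probabilities, i.e. P^q for q >= 1."""
    result = P
    for _ in range(q - 1):
        result = mat_mul(result, P)   # one application of the recurrence
    return result


# Hypothetical two-state chain.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
P2 = q_step(P, 2)
print(P2)
print([sum(row) for row in q_step(P, 5)])   # each power is again stochastic
```

For this chain the two-step probabilities are $3/8$, $5/8$ in the first row and $5/16$, $11/16$ in the second.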
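If the limits

    $\lim_{q \to \infty} p_{ij}^{(q)} = p_{ij}^{\infty} \quad (i, j = 1, 2, \ldots, n)$

or, in matrix notation,

    $\lim_{q \to \infty} P^q = P^{\infty} = \| p_{ij}^{\infty} \|_1^n$,

exist, then the values $p_{ij}^{\infty}$ ($i, j = 1, 2, \ldots, n$) are called the limiting or
final transition probabilities.⁴³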

In order to investigate under what conditions limiting transition proba- 
bilities exist and to derive the corresponding formulas, we introduce the fol- 
lowing terminology. 

We shall call a stochastic matrix P and the corresponding homogeneous 
Markov chain regular if P has no characteristic values of modulus 1 other 
than 1 itself and fully regular if, in addition, 1 is a simple root of the 
characteristic equation of P. 

A regular matrix $P$ is characterized by the fact that in its normal form
(69) (p. 75) the matrices $A_1, A_2, \ldots, A_g$ are primitive. For a fully regular
matrix we have, in addition, $g = 1$.

Furthermore, a homogeneous Markov chain is irreducible, reducible,
acyclic, or cyclic if the stochastic matrix $P$ of the chain is irreducible,
reducible, primitive, or imprimitive, respectively. Just as a primitive
stochastic matrix is a special form of a regular matrix, so an acyclic Markov
chain is a special form of a regular chain.

We shall prove that: Limiting transition probabilities exist for regular
homogeneous Markov chains only.


42 It follows from this formula that the probabilities $p_{ij}^{(q)}$ as well as $p_{ij}$
($i, j = 1, 2, \ldots, n$; $q = 1, 2, \ldots$) do not depend on the index $k$ of the original
instant $t_k$.

43 The matrix $P^{\infty}$, as the limit of stochastic matrices, is itself stochastic.




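For let $\psi(\lambda)$ be the minimal polynomial of the regular matrix
$P = \| p_{ij} \|_1^n$. Then

    $\psi(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_u)^{m_u} \quad (\lambda_i \neq \lambda_k \text{ for } i \neq k;\ i, k = 1, 2, \ldots, u)$.    (92)

By Theorem 10 we may assume that

    $\lambda_1 = 1, \quad m_1 = 1$.    (93)

By the formula (23) of Chapter V (Vol. I, p. 107),

    $P^q = \frac{C(1)}{\psi'(1)} + \sum_{k=2}^{u} \frac{1}{(m_k - 1)!} \left[ \frac{C(\lambda)\, \lambda^q}{\psi_k(\lambda)} \right]^{(m_k - 1)}_{\lambda = \lambda_k}$,    (94)

where $C(\lambda) = (\lambda E - P)^{-1} \psi(\lambda)$ is the reduced adjoint matrix and

    $\psi_k(\lambda) = \frac{\psi(\lambda)}{(\lambda - \lambda_k)^{m_k}} \quad (k = 1, 2, \ldots, u)$;

moreover

    $\psi_1(\lambda) = \frac{\psi(\lambda)}{\lambda - 1}$ and $\psi_1(1) = \psi'(1)$.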


If $P$ is a regular matrix, then

    $|\lambda_k| < 1 \quad (k = 2, 3, \ldots, u)$,

and therefore all the terms on the right-hand side of (94), except the first,
tend to zero for $q \to \infty$. Therefore, for a regular matrix $P$ the matrix
$P^{\infty}$ formed from the limiting transition probabilities exists, and

    $P^{\infty} = \frac{C(1)}{\psi'(1)}$.    (95)


The converse proposition is obvious. If the limit

    $P^{\infty} = \lim_{q \to \infty} P^q$    (96)

exists, then the matrix $P$ cannot have any characteristic value $\lambda_k$ for which
$\lambda_k \neq 1$ and $|\lambda_k| = 1$, since then the limit $\lim_{q \to \infty} \lambda_k^q$ would not exist.
(This limit must exist, since the limit (96) exists.)
We have proved that the matrix $P^{\infty}$ exists for a regular homogeneous
Markov chain (and for such a regular chain only). This matrix is
determined by (95).

We shall now show that $P^{\infty}$ can be expressed by the characteristic
polynomial

    $\Delta(\lambda) = (\lambda - \lambda_1)^{n_1} (\lambda - \lambda_2)^{n_2} \cdots (\lambda - \lambda_u)^{n_u}$    (97)

and the adjoint matrix $B(\lambda) = (\lambda E - P)^{-1} \Delta(\lambda)$.

From the identity

    $\frac{B(\lambda)}{\Delta(\lambda)} = \frac{C(\lambda)}{\psi(\lambda)}$

it follows by (92), (93), and (97) that

    $\frac{n_1 B^{(n_1 - 1)}(1)}{\Delta^{(n_1)}(1)} = \frac{C(1)}{\psi'(1)}$.

Therefore (95) may be replaced by the formula

    $P^{\infty} = \frac{n_1 B^{(n_1 - 1)}(1)}{\Delta^{(n_1)}(1)}$.    (98)

For a fully regular Markov chain, inasmuch as it is a special form of a
regular chain, the matrix $P^{\infty}$ exists and is determined by (95) or (98). In
this case $n_1 = 1$, and (98) assumes the form

    $P^{\infty} = \frac{B(1)}{\Delta'(1)}$.    (99)
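Formula (99) can be written out explicitly for a chain with two states. For $P = \| p_{ij} \|_1^2$ the adjoint matrix of $\lambda E - P$ at $\lambda = 1$ has the rows $(1 - p_{22},\ p_{12})$ and $(p_{21},\ 1 - p_{11})$, and $\Delta'(1) = 2 - (p_{11} + p_{22})$. The sketch below is an illustration only, restricted to the 2 x 2 case; the function names and the sample matrix are hypothetical. It evaluates (99) and compares the result with a high power of $P$.

```python
from fractions import Fraction


def limit_fully_regular_2x2(P):
    """Evaluate P^infinity = B(1) / Delta'(1), formula (99), for a 2 x 2 matrix.

    Assumes that P is stochastic and fully regular, so that Delta'(1) != 0.
    """
    (p11, p12), (p21, p22) = P
    B1 = [[1 - p22, p12], [p21, 1 - p11]]     # adjoint of (E - P)
    delta_prime_1 = 2 - (p11 + p22)           # derivative of det(lambda*E - P) at 1
    return [[b / delta_prime_1 for b in row] for row in B1]


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical fully regular chain; its second characteristic value is 1/4.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
P_inf = limit_fully_regular_2x2(P)

power = P
for _ in range(59):
    power = mat_mul(power, P)                 # power is now P^60
gap = max(abs(power[i][j] - P_inf[i][j]) for i in range(2) for j in range(2))
print(P_inf)
print(float(gap))
```

Both rows of the computed limit are $(1/3,\ 2/3)$, and the difference between $P^{60}$ and this limit is of the order of $(1/4)^{60}$.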

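2. Let us consider a regular chain of general type (not fully regular). We
write the corresponding matrix $P$ in the normal form

    $P = \begin{pmatrix}
    Q_1       & \cdots & O         & O         & \cdots & O      \\
    \vdots    & \ddots & \vdots    & \vdots    &        & \vdots \\
    O         & \cdots & Q_g       & O         & \cdots & O      \\
    U_{g+1,1} & \cdots & U_{g+1,g} & Q_{g+1}   & \cdots & O      \\
    \vdots    &        & \vdots    & \vdots    & \ddots & \vdots \\
    U_{s1}    & \cdots & U_{sg}    & U_{s,g+1} & \cdots & Q_s
    \end{pmatrix}$,    (100)

where $Q_1, \ldots, Q_g$ are primitive stochastic matrices and the maximal
characteristic values of the irreducible matrices $Q_{g+1}, \ldots, Q_s$ are less than 1.
Setting

    $U = \begin{pmatrix} U_{g+1,1} & \cdots & U_{g+1,g} \\ \vdots & & \vdots \\ U_{s1} & \cdots & U_{sg} \end{pmatrix},
    \quad
    W = \begin{pmatrix} Q_{g+1} & \cdots & O \\ \vdots & \ddots & \vdots \\ U_{s,g+1} & \cdots & Q_s \end{pmatrix}$,

we write $P$ in the form

    $P = \begin{pmatrix}
    \begin{matrix} Q_1 & \cdots & O \\ \vdots & \ddots & \vdots \\ O & \cdots & Q_g \end{matrix} & O \\
    U & W
    \end{pmatrix}$.

Then

    $P^q = \begin{pmatrix}
    \begin{matrix} Q_1^q & \cdots & O \\ \vdots & \ddots & \vdots \\ O & \cdots & Q_g^q \end{matrix} & O \\
    U_q & W^q
    \end{pmatrix}$    (101)

and

    $P^{\infty} = \lim_{q \to \infty} P^q = \begin{pmatrix}
    \begin{matrix} Q_1^{\infty} & \cdots & O \\ \vdots & \ddots & \vdots \\ O & \cdots & Q_g^{\infty} \end{matrix} & O \\
    U_{\infty} & W^{\infty}
    \end{pmatrix}$.

But $W^{\infty} = \lim_{q \to \infty} W^q = O$, because all the characteristic values of $W$ are of
modulus less than 1. Therefore

    $P^{\infty} = \begin{pmatrix}
    \begin{matrix} Q_1^{\infty} & \cdots & O \\ \vdots & \ddots & \vdots \\ O & \cdots & Q_g^{\infty} \end{matrix} & O \\
    U_{\infty} & O
    \end{pmatrix}$.    (102)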


Since $Q_1, \ldots, Q_g$ are primitive stochastic matrices, the matrices
$Q_1^{\infty}, \ldots, Q_g^{\infty}$ are positive, by (99) and (35) (p. 62):

    $Q_1^{\infty} > O, \ \ldots, \ Q_g^{\infty} > O$,

and in each of these matrices all the elements belonging to any one column
are equal:

    $Q_h^{\infty} = \| q_{*j}^{(h)} \| \quad (h = 1, 2, \ldots, g)$.


We note that the states $S_1, S_2, \ldots, S_n$ of the system fall into groups
corresponding to the normal form (100) of $P$:

    $\Sigma_1, \Sigma_2, \ldots, \Sigma_g, \Sigma_{g+1}, \ldots, \Sigma_s$.    (103)

To each group $\Sigma$ in (103) there corresponds a group of rows in (100).
In the terminology of Kolmogorov the states of the system that occur in
$\Sigma_1, \Sigma_2, \ldots, \Sigma_g$ are called essential and the states that occur in the
remaining groups $\Sigma_{g+1}, \ldots, \Sigma_s$ non-essential.

From the form (101) of $P^q$ it follows that in any finite number $q$ of steps
(from the instant $t_{k-q}$ to $t_k$) only the following transitions of the system
are possible: a) from an essential state to an essential state of the same
group; b) from a non-essential state to an essential state; and c) from a
non-essential state to a non-essential state of the same or a preceding group.

From the form (102) of $P^{\infty}$ it follows that: A limiting transition can
only lead from an arbitrary state to an essential state, i.e., the probability
of transition to any non-essential state tends to zero when the number of
steps $q$ tends to infinity. The essential states are therefore sometimes also
called limiting states.
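The division of the states into essential and non-essential ones can be read off from the graph of $P$ without first bringing $P$ to the normal form (100). The sketch below is an illustration only: it assumes the usual reachability characterization, namely that a state is essential if every state that can be reached from it leads back to it, which singles out exactly the states of the isolated groups $\Sigma_1, \ldots, \Sigma_g$. The function name and the sample chain are hypothetical, and states are numbered from 0.

```python
def essential_states(P):
    """Return the indices of the essential states of a stochastic matrix P.

    State i is treated as essential if every state j reachable from i
    can in turn reach i (reachability in any number of steps).
    """
    n = len(P)
    reach = [[P[i][j] > 0 or i == j for j in range(n)] for i in range(n)]
    for k in range(n):                      # Warshall's transitive closure
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return [i for i in range(n)
            if all(reach[j][i] for j in range(n) if reach[i][j])]


# Hypothetical chain: state 0 is absorbing, states 1 and 2 eventually leak into it.
P = [[1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.5]]
print(essential_states(P))
```

In this example only the absorbing state is essential; the other two states form a non-essential group, and the probability of finding the system in them tends to zero.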


3. From (95) it follows that⁴⁵

    $(E - P) P^{\infty} = O$.

Hence it is clear that: Every column of $P^{\infty}$ is a characteristic vector of
the stochastic matrix $P$ for the characteristic value $\lambda = 1$.

For a fully regular matrix $P$, 1 is a simple root of the characteristic
equation and (apart from scalar factors) only one characteristic vector
$(1, 1, \ldots, 1)$ of $P$ belongs to it. Therefore all the elements of the $j$-th
column of $P^{\infty}$ are equal to one and the same non-negative number $p_{*j}^{\infty}$:

    $p_{ij}^{\infty} = p_{*j}^{\infty} \geq 0 \quad (i, j = 1, 2, \ldots, n;\ \sum_{j=1}^{n} p_{*j}^{\infty} = 1)$.    (104)
. j= 


44 See [212] and [46], pp. 37-39. 
45 This formula holds for an arbitrary regular chain and can be obtained from the
obvious equation $P^q - P \cdot P^{q-1} = O$ by passing to the limit $q \to \infty$.

Thus, in a fully regular chain the limiting transition probabilities do
not depend on the initial state.

Conversely, if in a regular homogeneous Markov chain the limiting
transition probabilities do not depend on the initial state, i.e., if (104) holds,
then obviously in the scheme (102) for $P^{\infty}$ we have $g = 1$. But then $n_1 = 1$
and the chain is fully regular.

For an acyclic chain, which is a special case of a fully regular chain,
$P$ is a primitive matrix. Therefore $P^q > O$ (see Theorem 8 on p. 80) for
some $q > 0$. But then also $P^{\infty} = P^{\infty} P^q > O$.⁴⁶

Conversely, it follows from $P^{\infty} > O$ that $P^q > O$ for some $q > 0$, and
this means by Theorem 8 that $P$ is primitive and hence that the given
homogeneous Markov chain is acyclic.

We formulate these results in the following theorem: 

THEOREM 11: 1. In a homogeneous Markov chain all the limiting
transition probabilities exist if and only if the chain is regular. In that case
the matrix $P^{\infty}$ formed from the limiting transition probabilities is
determined by (95) or (98).

2. In a regular homogeneous Markov chain the limiting transition
probabilities are independent of the initial state if and only if the chain is
fully regular. In that case the matrix $P^{\infty}$ is determined by (99).

3. In a regular homogeneous Markov chain all the limiting transition
probabilities are different from zero if and only if the chain is acyclic.⁴⁷


4. We now consider the columns of absolute probabilities

    $\overset{k}{p} = (\overset{k}{p}_1, \overset{k}{p}_2, \ldots, \overset{k}{p}_n) \quad (k = 0, 1, 2, \ldots)$,    (105)

where $\overset{k}{p}_i$ is the probability of finding the system in the state $S_i$
($i = 1, 2, \ldots, n$; $k = 0, 1, 2, \ldots$) at the instant $t_k$. Making use of the
theorems on the addition and multiplication of probabilities, we find:

    $\overset{k}{p}_j = \sum_{h=1}^{n} \overset{0}{p}_h\, p_{hj}^{(k)} \quad (j = 1, 2, \ldots, n;\ k = 1, 2, \ldots)$

or, in matrix notation,


46 This matrix equation is obtained by passing to the limit $m \to \infty$ from the equation
$P^m = P^{m-q} \cdot P^q$ ($m > q$). $P^{\infty}$ is a stochastic matrix; therefore $P^{\infty} \geq O$ and there are
non-zero elements in every row of $P^{\infty}$. Hence $P^{\infty} P^q > O$. Instead of Theorem 8 we can
use here the formula (99) and the inequality (35) (p. 62).

47 Note that $P^{\infty} > O$ implies that the chain is acyclic and therefore regular. Hence it
follows automatically from $P^{\infty} > O$ that the limiting transition probabilities do not
depend on the initial state, i.e., that the formulas (104) hold.

    $\overset{k}{p} = (P^T)^k\, \overset{0}{p} \quad (k = 1, 2, \ldots)$,    (106)

where $P^T$ is the transpose of $P$.

All the absolute probabilities (105) can be determined from (106) if the
initial probabilities $\overset{0}{p}_1, \overset{0}{p}_2, \ldots, \overset{0}{p}_n$ and the matrix of transition
probabilities $P = \| p_{ij} \|_1^n$ are known.

We introduce the limiting absolute probabilities

    $\overset{\infty}{p}_i = \lim_{k \to \infty} \overset{k}{p}_i \quad (i = 1, 2, \ldots, n)$

or

    $\overset{\infty}{p} = (\overset{\infty}{p}_1, \overset{\infty}{p}_2, \ldots, \overset{\infty}{p}_n) = \lim_{k \to \infty} \overset{k}{p}$.

When we take the limit $k \to \infty$ on both sides of (106), we obtain:

    $\overset{\infty}{p} = (P^{\infty})^T\, \overset{0}{p}$.    (107)


Note that the existence of the matrix of limiting transition probabilities
$P^{\infty}$ implies the existence of the limiting absolute probabilities

    $\overset{\infty}{p} = (\overset{\infty}{p}_1, \overset{\infty}{p}_2, \ldots, \overset{\infty}{p}_n)$

for arbitrary initial probabilities $\overset{0}{p} = (\overset{0}{p}_1, \overset{0}{p}_2, \ldots, \overset{0}{p}_n)$, and vice versa.

From the formula (107) and the form (102) of $P^{\infty}$ it follows that: The
limiting absolute probabilities corresponding to non-essential states are zero.


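Multiplying both sides of the matrix equation

    $P^T \cdot (P^{\infty})^T = (P^{\infty})^T$

by $\overset{0}{p}$ on the right, we obtain by (107):

    $P^T\, \overset{\infty}{p} = \overset{\infty}{p}$,    (108)

i.e.: The column of limiting absolute probabilities $\overset{\infty}{p}$ is a characteristic
vector of $P^T$ for the characteristic value $\lambda = 1$.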

If a fully regular Markov chain is given, then $\lambda = 1$ is a simple root of
the characteristic equation of $P^T$. In this case, the column of limiting
absolute probabilities is uniquely determined by (108) (because
$\overset{\infty}{p}_j \geq 0$ ($j = 1, 2, \ldots, n$) and $\sum_{j=1}^{n} \overset{\infty}{p}_j = 1$).

Suppose that a fully regular Markov chain is given. Then it follows
from (104) and (107) that:

    $\overset{\infty}{p}_j = \sum_{h=1}^{n} \overset{0}{p}_h\, p_{hj}^{\infty} = p_{*j}^{\infty} \sum_{h=1}^{n} \overset{0}{p}_h = p_{*j}^{\infty} \quad (j = 1, 2, \ldots, n)$.    (109)

In this case the limiting absolute probabilities $\overset{\infty}{p}_1, \overset{\infty}{p}_2, \ldots, \overset{\infty}{p}_n$ do not
depend on the initial probabilities $\overset{0}{p}_1, \overset{0}{p}_2, \ldots, \overset{0}{p}_n$.

Conversely, $\overset{\infty}{p}$ is independent of $\overset{0}{p}$ on account of (107) if and only if
all the rows of $P^{\infty}$ are equal, i.e.,

    $p_{hj}^{\infty} = p_{*j}^{\infty} \quad (h, j = 1, 2, \ldots, n)$,

so that (by Theorem 11) $P$ is a fully regular matrix.

If $P$ is primitive, then $P^{\infty} > O$ and hence, by (109),

    $\overset{\infty}{p}_j > 0 \quad (j = 1, 2, \ldots, n)$.

Conversely, if all the $\overset{\infty}{p}_j$ ($j = 1, 2, \ldots, n$) are positive and do not depend
on the initial probabilities, then all the elements in every column of $P^{\infty}$ are
equal and by (109) $P^{\infty} > O$, and this means by Theorem 11 that $P$ is
primitive, i.e., that the given chain is acyclic.

From these remarks it follows that Theorem 11 can also be formulated 
as follows: 


THEOREM 11': 1. In a homogeneous Markov chain all the limiting
absolute probabilities exist for arbitrary initial probabilities if and only if
the chain is regular.

2. In a homogeneous Markov chain the limiting absolute probabilities
exist for arbitrary initial probabilities and are independent of them if and
only if the chain is fully regular.

3. In a homogeneous Markov chain positive limiting absolute
probabilities exist for arbitrary initial probabilities and are independent of
them if and only if the chain is acyclic.⁴⁸


5. We now consider a homogeneous Markov chain of general type with a
matrix $P$ of transition probabilities.


48 The second part of Theorem 11' is sometimes called the ergodic theorem and the
first part the general quasi-ergodic theorem for homogeneous Markov chains (see [4],
pp. 473 and 476).

We choose the normal form (69) for $P$ and denote by $h_1, h_2, \ldots, h_g$ the
indices of imprimitivity of the matrices $A_1, A_2, \ldots, A_g$ in (69). Let $h$ be
the least common multiple of the integers $h_1, h_2, \ldots, h_g$. Then the matrix
$P^h$ has no characteristic values, other than 1, of modulus 1, i.e., $P^h$ is
regular; here $h$ is the least exponent for which $P^h$ is regular. We shall call
$h$ the period of the given homogeneous Markov chain.
Since $P^h$ is regular, the limit

    $\lim_{q \to \infty} P^{hq} = (P^h)^{\infty}$

exists and hence the limits

    $P_r = \lim_{q \to \infty} P^{r + qh} = P^r (P^h)^{\infty} \quad (r = 0, 1, \ldots, h - 1)$

also exist.

Thus, in general, the sequence of matrices

    $P, P^2, P^3, \ldots$

splits into $h$ subsequences with the limits $P_r = P^r (P^h)^{\infty}$ ($r = 0, 1, \ldots, h - 1$).


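When we go from the transition probabilities to the absolute
probabilities by means of (106), we find that the sequence

    $\overset{1}{p}, \overset{2}{p}, \overset{3}{p}, \ldots$

splits into $h$ subsequences with the limits

    $\lim_{q \to \infty} \overset{r + qh}{p} = \left[ (P^h)^{\infty} \right]^T \overset{r}{p} \quad (r = 0, 1, 2, \ldots, h - 1)$.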

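For an arbitrary homogeneous Markov chain with a finite number of
states the limits of the arithmetic means always exist:

    $\tilde{P} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} P^k = \frac{1}{h} (E + P + \cdots + P^{h-1}) (P^h)^{\infty}$    (110)

and

    $\tilde{p} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \overset{k}{p} = \tilde{P}^T\, \overset{0}{p}$.    (110')

Here $\tilde{P} = \| \tilde{p}_{ij} \|_1^n$ and $\tilde{p} = (\tilde{p}_1, \tilde{p}_2, \ldots, \tilde{p}_n)$. The values $\tilde{p}_{ij}$
($i, j = 1, 2, \ldots, n$) and $\tilde{p}_j$ ($j = 1, 2, \ldots, n$) are called the mean limiting
transition probabilities and mean limiting absolute probabilities, respectively.

Since

    $\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} P^{k+1} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} P^k$,

we have

    $\tilde{P} P = \tilde{P}$

and therefore, by (110'),

    $P^T \tilde{p} = \tilde{p}$;    (111)

i.e., $\tilde{p}$ is a characteristic vector of $P^T$ for $\lambda = 1$.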
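The difference between the powers $P^k$ and their arithmetic means is already visible for the simplest cyclic chain. In the sketch below (an illustration only; the function names are hypothetical) the chain has two states that are exchanged at every step, so that $h = 2$ and $(P^h)^{\infty} = E$. The powers oscillate between $P$ and $E$, while the means settle at $\frac{1}{2}(E + P)$, as (110) predicts.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cesaro_mean(P, N):
    """Arithmetic mean (1/N) (P + P^2 + ... + P^N), computed exactly."""
    n = len(P)
    total = [[Fraction(0)] * n for _ in range(n)]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(N):
        power = mat_mul(power, P)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return [[t / N for t in row] for row in total]


P = [[0, 1],
     [1, 0]]                       # cyclic chain with period h = 2
print(mat_mul(P, P))               # P^2 is the unit matrix, so P^k has no limit
print(cesaro_mean(P, 1000))        # every element equals 1/2
```

For odd $N$ the mean differs from $\frac{1}{2}(E + P)$ by terms of order $1/N$, so the limit in (110) is approached but not attained.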
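Note that by (69) and (110) we may represent $\tilde{P}$ in the form

    $\tilde{P} = \begin{pmatrix}
    \begin{matrix} \tilde{A}_1 & O & \cdots & O \\ O & \tilde{A}_2 & \cdots & O \\ \vdots & & \ddots & \vdots \\ O & O & \cdots & \tilde{A}_g \end{matrix} & O \\
    \tilde{U} & \tilde{W}
    \end{pmatrix}$,

where

    $\tilde{A}_i = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} A_i^k \quad (i = 1, 2, \ldots, g), \qquad \tilde{W} = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} W^k$,

    $W = \begin{pmatrix}
    A_{g+1} & O       & \cdots & O      \\
    *       & A_{g+2} & \cdots & O      \\
    \vdots  &         & \ddots & \vdots \\
    *       & *       & \cdots & A_s
    \end{pmatrix}$.

Since all the characteristic values of $W$ are of modulus less than 1, we
have

    $\lim_{k \to \infty} W^k = O$,

and therefore $\tilde{W} = O$.

Hence

    $\tilde{P} = \begin{pmatrix}
    \begin{matrix} \tilde{A}_1 & O & \cdots & O \\ O & \tilde{A}_2 & \cdots & O \\ \vdots & & \ddots & \vdots \\ O & O & \cdots & \tilde{A}_g \end{matrix} & O \\
    \tilde{U} & O
    \end{pmatrix}$.    (112)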

Since $\tilde{P}$ is a stochastic matrix, the matrices $\tilde{A}_1, \tilde{A}_2, \ldots, \tilde{A}_g$ are also
stochastic.

From this representation of $\tilde{P}$ and from (110') it follows that: The mean
limiting absolute probabilities corresponding to non-essential states are
always zero.

If $g = 1$ in the normal form of $P$, then $\lambda = 1$ is a simple characteristic
value of $P^T$.

In this case $\tilde{p}$ is uniquely determined by (111), and the mean limiting
probabilities $\tilde{p}_1, \tilde{p}_2, \ldots, \tilde{p}_n$ do not depend on the initial probabilities
$\overset{0}{p}_1, \overset{0}{p}_2, \ldots, \overset{0}{p}_n$. Conversely, if $\tilde{p}$ does not depend on $\overset{0}{p}$, then $\tilde{P}$ is of
rank 1 by (110'). But the rank of (112) can be 1 only if $g = 1$.


We formulate these results in the following theorem:⁴⁹

THEOREM 12: For an arbitrary homogeneous Markov chain with period
$h$ the probability matrices $P^k$ and $\overset{k}{p}$ tend to a periodic repetition with period
$h$ for $k \to \infty$; moreover, the mean limiting transition probabilities and the
absolute probabilities $\tilde P = \| \tilde p_{ij} \|_1^n$ and $\tilde p = (\tilde p_1, \tilde p_2, \ldots, \tilde p_n)$ defined by (110)
and (110') always exist.

The mean absolute probabilities corresponding to non-essential states are
always zero.

If $g = 1$ in the normal form of $P$ (and only in this case), the mean limiting
absolute probabilities $\tilde p_1, \tilde p_2, \ldots, \tilde p_n$ are independent of the initial
probabilities $\overset{0}{p}_1, \overset{0}{p}_2, \ldots, \overset{0}{p}_n$ and are uniquely determined by (111).
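
The statements of the theorem can be checked numerically on a small periodic chain. The following is a minimal sketch, not part of the original text: the three-state cyclic chain, the initial distribution `p0` and the helper names `mat_mul` and `cesaro_mean` are all assumptions made for illustration. The powers of the matrix repeat with period 3 and do not converge, whereas the arithmetic means in (110) do, and the resulting vector satisfies (111).

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cesaro_mean(p, n_terms):
    """Return (1/N) * (P + P^2 + ... + P^N) in exact rational arithmetic."""
    n = len(p)
    total = [[Fraction(0)] * n for _ in range(n)]
    power = [row[:] for row in p]
    for _ in range(n_terms):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, p)
    return [[x / n_terms for x in row] for row in total]

# Hypothetical example: a cyclic chain on three states (period h = 3).
P = [[Fraction(0), Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(0), Fraction(0)]]

# N = 300 is a multiple of h, so the finite mean already equals the limit here.
P_mean = cesaro_mean(P, 300)

p0 = [Fraction(1), Fraction(0), Fraction(0)]   # assumed initial probabilities
# p_mean = P_mean^T p0, as in (110')
p_mean = [sum(P_mean[i][j] * p0[i] for i in range(3)) for j in range(3)]
# P^T p_mean, which should reproduce p_mean by (111)
check = [sum(P[i][j] * p_mean[i] for i in range(3)) for j in range(3)]

print(P_mean)
print(p_mean, check)
```

In this example the chain is irreducible, so g = 1 and the mean limiting absolute probabilities come out the same for every choice of `p0`.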


§ 8. Totally Non-negative Matrices 


In this and the following sections we consider real matrices in which not 
only the elements, but also all the minors of every order are non-negative. 
Such matrices have important applications in the theory of small oscilla- 
tions of elastic systems. The reader will find a detailed study of these 
matrices and their applications in the book [17]. Here we shall only deal 
with some of their basic properties. 


1. We begin with a definition:

DEFINITION 5: A rectangular matrix

$$A = \| a_{ik} \| \qquad (i = 1, 2, \ldots, m;\ k = 1, 2, \ldots, n)$$

is called totally non-negative (totally positive) if all its minors of any order
are non-negative (positive):

$$A\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ k_1 & k_2 & \cdots & k_p \end{pmatrix} \ge 0 \quad (> 0)$$

$$(1 \le i_1 < i_2 < \cdots < i_p \le m;\ 1 \le k_1 < k_2 < \cdots < k_p \le n;\ p = 1, 2, \ldots, \min(m, n)).$$


49 This theorem is sometimes called the asymptotic theorem for homogeneous Markov
chains. See [4], pp. 479-82.


In what follows we shall only consider square totally non-negative and 
totally positive matrices. 


Example 1. The generalized Vandermonde matrix

$$V = \| a_i^{\alpha_k} \|_1^n \qquad (0 < a_1 < a_2 < \cdots < a_n;\ \alpha_1 < \alpha_2 < \cdots < \alpha_n)$$

is totally positive. Let us show first that $|V| \ne 0$. Indeed, from $|V| = 0$
it would follow that we could determine real numbers $c_1, c_2, \ldots, c_n$, not all
equal to zero, such that the function

$$f(x) = \sum_{k=1}^{n} c_k x^{\alpha_k} \qquad (\alpha_i \ne \alpha_j \text{ for } i \ne j)$$

has the $n$ zeros $x_i = a_i$ $(i = 1, 2, \ldots, n)$, where $n$ is the number of terms in
the above sum. For $n = 1$ this is impossible. Let us make the induction
hypothesis that it is impossible for a sum of $n_1$ terms, where $n_1 < n$,
and show that it is then also impossible for the given function $f(x)$. Assume
the contrary. Then by Rolle's Theorem the function $f_1(x) = [x^{-\alpha_1} f(x)]'$,
consisting of $n - 1$ terms, would have $n - 1$ positive zeros, and this contradicts
the induction hypothesis.

Thus, $|V| \ne 0$. But for $\alpha_1 = 0, \alpha_2 = 1, \ldots, \alpha_n = n - 1$ the determinant
$|V|$ goes over into the ordinary Vandermonde determinant $|a_i^{k-1}|_1^n$, which
is positive. Since the transition from this to the generalized Vandermonde
determinant can be carried out by means of a continuous change of the
exponents $\alpha_1, \alpha_2, \ldots, \alpha_n$ with preservation of the inequalities $\alpha_1 < \alpha_2 < \cdots < \alpha_n$,
and since, by what we have shown, the determinant does not vanish
in this process, we have $|V| > 0$ for arbitrary $0 < a_1 < a_2 < \cdots < a_n$.

Since every minor of $V$ can be regarded as the determinant of some
generalized Vandermonde matrix, all the minors of $V$ are positive.
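
Definition 5 can be tested directly on small matrices by enumerating all minors. The sketch below is only an illustration under stated assumptions: the nodes and exponents are sample data chosen here, and the helper names `det`, `all_minors`, `is_totally_positive` and `is_totally_non_negative` do not come from the text. The enumeration grows very quickly with the order of the matrix, so it is meant for small examples only.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by exact (rational) Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in m]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row with a non-zero pivot in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result

def all_minors(m):
    """Yield every minor of m: all orders, all row and column selections."""
    rows, cols = len(m), len(m[0])
    for p in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), p):
            for c in combinations(range(cols), p):
                yield det([[m[i][k] for k in c] for i in r])

def is_totally_positive(m):
    return all(d > 0 for d in all_minors(m))

def is_totally_non_negative(m):
    return all(d >= 0 for d in all_minors(m))

# Sample data (an assumption of this sketch): 0 < a_1 < a_2 < a_3 and
# alpha_1 < alpha_2 < alpha_3.
nodes = [1, 2, 3]
exponents = [0, 2, 5]
V = [[a ** e for e in exponents] for a in nodes]

print(is_totally_positive(V))
```

A matrix such as `[[1, 1], [1, 1]]` is totally non-negative but not totally positive, which the two predicates distinguish.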


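Example 2. We consider a Jacobi matrix

$$J = \begin{pmatrix}
a_1 & b_1 & 0 & \cdots & 0 & 0 \\
c_1 & a_2 & b_2 & \cdots & 0 & 0 \\
0 & c_2 & a_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{n-1} & b_{n-1} \\
0 & 0 & 0 & \cdots & c_{n-1} & a_n
\end{pmatrix} \tag{113}$$

in which all the elements are zero outside the main diagonal and the first
super-diagonal and sub-diagonal. Let us set up a formula that expresses an
arbitrary minor of the matrix in terms of principal minors and the elements
$b$, $c$. Suppose that

$$1 \le i_1 < i_2 < \cdots < i_p \le n, \qquad 1 \le k_1 < k_2 < \cdots < k_p \le n$$

and

$$i_1 = k_1,\ i_2 = k_2,\ \ldots,\ i_{\nu_1} = k_{\nu_1};\quad i_{\nu_1+1} \ne k_{\nu_1+1},\ \ldots,\ i_{\nu_2} \ne k_{\nu_2};\quad i_{\nu_2+1} = k_{\nu_2+1},\ \ldots,\ i_{\nu_3} = k_{\nu_3};\ \ldots;$$

then

$$J\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ k_1 & k_2 & \cdots & k_p \end{pmatrix}
= J\begin{pmatrix} i_1 & \cdots & i_{\nu_1} \\ k_1 & \cdots & k_{\nu_1} \end{pmatrix}
J\begin{pmatrix} i_{\nu_1+1} \\ k_{\nu_1+1} \end{pmatrix} \cdots
J\begin{pmatrix} i_{\nu_2} \\ k_{\nu_2} \end{pmatrix}
J\begin{pmatrix} i_{\nu_2+1} & \cdots & i_{\nu_3} \\ k_{\nu_2+1} & \cdots & k_{\nu_3} \end{pmatrix} \cdots. \tag{114}$$

This formula is a consequence of the easily verifiable equation:

$$J\begin{pmatrix} i_1 & \cdots & i_p \\ k_1 & \cdots & k_p \end{pmatrix}
= J\begin{pmatrix} i_1 & \cdots & i_{\nu-1} \\ k_1 & \cdots & k_{\nu-1} \end{pmatrix}
J\begin{pmatrix} i_\nu \\ k_\nu \end{pmatrix}
J\begin{pmatrix} i_{\nu+1} & \cdots & i_p \\ k_{\nu+1} & \cdots & k_p \end{pmatrix}
\qquad (\text{for } i_\nu \ne k_\nu). \tag{115}$$

From (114) it follows that every minor is the product of certain principal
minors and certain elements of $J$. Thus: For $J$ to be totally non-negative
it is necessary and sufficient that all the principal minors and the
elements $b$, $c$ should be non-negative.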


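2. A totally non-negative matrix $A = \| a_{ik} \|_1^n$ always satisfies the following
important determinantal inequality:⁵⁰

$$A\begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}
\le A\begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix}
A\begin{pmatrix} p+1 & \cdots & n \\ p+1 & \cdots & n \end{pmatrix} \qquad (p < n). \tag{116}$$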


Before deriving this inequality, we prove the following lemma:

LEMMA 5: If in a totally non-negative matrix $A = \| a_{ik} \|_1^n$ any principal
minor vanishes, then every principal minor 'bordering' it also vanishes.


Proof. The lemma will be proved if we can show that for a totally
non-negative matrix $A = \| a_{ik} \|_1^n$ it follows from

$$A\begin{pmatrix} 1 & 2 & \cdots & q \\ 1 & 2 & \cdots & q \end{pmatrix} = 0 \qquad (q < n) \tag{117}$$

that

$$A\begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} = 0. \tag{118}$$


50 See [172] and [17], pp. 111ff, where it is also shown that the equality sign in
(116) can only hold in the following obvious cases:

1) One of the factors on the right-hand side of (116) is zero;

2) All the elements $a_{ik}$ $(i = 1, 2, \ldots, p;\ k = p+1, \ldots, n)$ or $a_{ik}$ $(i = p+1, \ldots, n;\ k = 1, 2, \ldots, p)$ are zero.

The inequality (116) has the same outward form as the generalized Hadamard
inequality (see (33), Vol. I, p. 255) for a positive-definite hermitian or quadratic form.

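For this purpose we consider two cases:

1) $a_{11} = 0$. Since

$$\begin{vmatrix} a_{11} & a_{1k} \\ a_{i1} & a_{ik} \end{vmatrix} = -a_{i1} a_{1k} \ge 0, \qquad a_{i1} \ge 0,\ a_{1k} \ge 0 \qquad (i, k = 2, \ldots, n),$$

either all the $a_{i1} = 0$ $(i = 2, \ldots, n)$ or all the $a_{1k} = 0$ $(k = 2, \ldots, n)$.
These equations and $a_{11} = 0$ imply (118).

2) $a_{11} \ne 0$. Then for some $p$ $(1 < p \le q)$

$$A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix} \ne 0, \qquad
A\begin{pmatrix} 1 & 2 & \cdots & p-1 & p \\ 1 & 2 & \cdots & p-1 & p \end{pmatrix} = 0. \tag{119}$$

We introduce bordered determinants

$$d_{ik} = A\begin{pmatrix} 1 & 2 & \cdots & p-1 & i \\ 1 & 2 & \cdots & p-1 & k \end{pmatrix} \qquad (i, k = p, p+1, \ldots, n) \tag{120}$$

and form from them a matrix $D = \| d_{ik} \|_p^n$.
By Sylvester's identity (Vol. I, Chapter II, § 3),

$$D\begin{pmatrix} i_1 & i_2 & \cdots & i_q \\ k_1 & k_2 & \cdots & k_q \end{pmatrix}
= \left[ A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix} \right]^{q-1}
A\begin{pmatrix} 1 & 2 & \cdots & p-1 & i_1 & i_2 & \cdots & i_q \\ 1 & 2 & \cdots & p-1 & k_1 & k_2 & \cdots & k_q \end{pmatrix} \ge 0 \tag{121}$$

$$(p \le i_1 < i_2 < \cdots < i_q \le n;\ p \le k_1 < k_2 < \cdots < k_q \le n;\ q = 1, 2, \ldots, n-p+1),$$

so that $D$ is a totally non-negative matrix.
Since by (119)

$$d_{pp} = A\begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix} = 0,$$

the matrix $D$ falls under the case 1) and

$$D\begin{pmatrix} p & p+1 & \cdots & n \\ p & p+1 & \cdots & n \end{pmatrix}
= \left[ A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix} \right]^{n-p}
A\begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} = 0.$$

Since $A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix} \ne 0$, (118) follows, and the lemma is proved.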


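3. We may now assume in the derivation of the inequality (116) that all
the principal minors of $A$ are different from zero, since by Lemma 5 one of
the principal minors can only be zero when $|A| = 0$, and in this case the
inequality (116) is obvious.

For $n = 2$, (116) can be verified immediately:

$$A\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21} \le a_{11} a_{22},$$

since $a_{12} \ge 0$, $a_{21} \ge 0$. We shall establish (116) for $n > 2$ under the assumption
that it is true for matrices of order less than $n$. Moreover, without loss
of generality, we may assume that $p > 1$, since otherwise by reversing the
numbering of the rows and columns we could interchange the roles of $p$
and $n - p$.

We now consider again the matrix $D = \| d_{ik} \|_p^n$, where the $d_{ik}$ $(i, k = p, p+1, \ldots, n)$
are defined by (120); we use Sylvester's identity twice as well
as the basic inequality (116) for matrices of order less than $n$ and obtain:

$$A\begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}
= \frac{D\begin{pmatrix} p & p+1 & \cdots & n \\ p & p+1 & \cdots & n \end{pmatrix}}
{\left[ A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix} \right]^{n-p}}
\le \frac{d_{pp}\, D\begin{pmatrix} p+1 & \cdots & n \\ p+1 & \cdots & n \end{pmatrix}}
{\left[ A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix} \right]^{n-p}}$$

$$= \frac{A\begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix}
A\begin{pmatrix} 1 & 2 & \cdots & p-1 & p+1 & \cdots & n \\ 1 & 2 & \cdots & p-1 & p+1 & \cdots & n \end{pmatrix}}
{A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix}}$$

$$\le \frac{A\begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix}
A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix}
A\begin{pmatrix} p+1 & \cdots & n \\ p+1 & \cdots & n \end{pmatrix}}
{A\begin{pmatrix} 1 & 2 & \cdots & p-1 \\ 1 & 2 & \cdots & p-1 \end{pmatrix}}
= A\begin{pmatrix} 1 & 2 & \cdots & p \\ 1 & 2 & \cdots & p \end{pmatrix}
A\begin{pmatrix} p+1 & \cdots & n \\ p+1 & \cdots & n \end{pmatrix}. \tag{122}$$

Thus, the inequality (116) has been established.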
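
The inequality (116) is easy to test on a concrete matrix. The sketch below is an illustration only, under assumptions made here: the symmetric Pascal matrix is used as sample data, its total non-negativity is confirmed by brute force rather than taken for granted, and the helper names `det` and `minor` are hypothetical.

```python
from itertools import combinations
from math import comb

def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minor(m, rows, cols):
    """Minor of m formed from the given row and column indices."""
    return det([[m[i][k] for k in cols] for i in rows])

n = 4
# Symmetric Pascal matrix a_ik = C(i+k, i) (indices from 0), used as sample data.
A = [[comb(i + k, i) for k in range(n)] for i in range(n)]

# Brute-force confirmation that the sample is totally non-negative.
totally_non_negative = all(
    minor(A, r, c) >= 0
    for p in range(1, n + 1)
    for r in combinations(range(n), p)
    for c in combinations(range(n), p)
)

full = minor(A, range(n), range(n))
# Triples (p, left-hand side of (116), right-hand side of (116)), p = 1, ..., n-1.
checks = [(p, full, minor(A, range(p), range(p)) * minor(A, range(p, n), range(p, n)))
          for p in range(1, n)]

print(totally_non_negative)
print(checks)
```

For this sample the left-hand side equals 1 for every p, while the right-hand sides are 4, 20 and 20, so the inequality is strict in each case.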
Let us make the following definition : 


DEFINITION 6. A minor

$$A\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ k_1 & k_2 & \cdots & k_p \end{pmatrix}
\qquad (1 \le i_1 < i_2 < \cdots < i_p \le n;\ 1 \le k_1 < k_2 < \cdots < k_p \le n) \tag{123}$$

of the matrix $A = \| a_{ik} \|_1^n$ will be called almost principal if of the differences
$i_1 - k_1, i_2 - k_2, \ldots, i_p - k_p$ only one is not zero.

We can then point out that the whole derivation of (116) (and the proof
of the auxiliary lemma) remain valid if the condition 'A is totally
non-negative' is replaced by the weaker condition 'all the principal and almost
principal minors of A are non-negative.'⁵¹


§ 9. Oscillatory Matrices 


1. The characteristic values and characteristic vectors of totally positive
matrices have a number of remarkable properties. However, the class of
totally positive matrices is not wide enough from the point of view of
applications to small oscillations of elastic systems. In this respect, the class of
totally non-negative matrices is sufficiently extensive. But the spectral
properties we need do not hold for all totally non-negative matrices. Now
there exists an intermediate class (between that of totally positive and that
of totally non-negative matrices) in which the spectral properties of totally
positive matrices are preserved and which is of sufficiently wide scope for
the applications. The matrices of this intermediate class have been called
'oscillatory.' The name is due to the fact that oscillatory matrices form the
mathematical apparatus for the study of oscillatory properties of small
vibrations of elastic systems.⁵²


DEFINITION 7. A matrix $A = \| a_{ik} \|_1^n$ is called oscillatory if $A$ is totally
non-negative and if there exists an integer $q > 0$ such that $A^q$ is totally
positive.

Example. A Jacobi matrix $J$ (see (113)) is oscillatory if and only if
1. all the numbers $b$, $c$ are positive and 2. the successive principal minors are
positive:

$$a_1 > 0, \quad
\begin{vmatrix} a_1 & b_1 \\ c_1 & a_2 \end{vmatrix} > 0, \quad
\begin{vmatrix} a_1 & b_1 & 0 \\ c_1 & a_2 & b_2 \\ 0 & c_2 & a_3 \end{vmatrix} > 0, \quad \ldots, \quad
\begin{vmatrix}
a_1 & b_1 & 0 & \cdots & 0 & 0 \\
c_1 & a_2 & b_2 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & c_{n-1} & a_n
\end{vmatrix} > 0. \tag{124}$$


51 See [214]. We take this opportunity of mentioning that in the second edition of the
book [17] by F. R. Gantmacher and M. G. Krein a mistake crept in which was first
pointed out to the authors by D. M. Kotelyanskii. On p. 111 of that book an almost
principal minor (123) was defined by the equation

$$\sum_{\nu=1}^{p} | i_\nu - k_\nu | = 1.$$

With this definition, the inequality (116) does not follow from the fact that the principal
and the almost principal minors are non-negative. However, all the statements and proofs
of § 6, Chapter II in [17] that refer to the fundamental inequality remain valid if an
almost principal minor is defined as above and as we have done in the paper [214].


52 See [17], Introduction, Chapter III, and Chapter IV.


Necessity of 1., 2. The numbers $b$, $c$ are non-negative, because $J \ge O$.
But none of the numbers $b$, $c$ may be zero, since otherwise the matrix would
be reducible and then the inequality $J^q > O$ could not hold for any $q > 0$.
Hence, all the numbers $b$, $c$ are positive. All the principal minors of (124)
are positive, by Lemma 5, since it follows from $|J| \ge 0$ and $|J^q| > 0$ that
$|J| > 0$.

Sufficiency of 1., 2. When we expand $|J|$ we easily see that the numbers
$b$, $c$ occur in $|J|$ only as products $b_1 c_1, b_2 c_2, \ldots, b_{n-1} c_{n-1}$. The same
applies to every principal minor of 'zero density,' i.e., a minor formed from
successive rows and columns (without gaps). But every principal minor of
$J$ is a product of principal minors of zero density. Therefore: In every
principal minor of $J$ the numbers $b$ and $c$ occur only as products $b_1 c_1, b_2 c_2, \ldots,
b_{n-1} c_{n-1}$.

We now form the symmetrical Jacobi matrix

$$\tilde J = \begin{pmatrix}
a_1 & \tilde b_1 & & & 0 \\
\tilde b_1 & a_2 & \tilde b_2 & & \\
 & \tilde b_2 & \ddots & \ddots & \\
 & & \ddots & \ddots & \tilde b_{n-1} \\
0 & & & \tilde b_{n-1} & a_n
\end{pmatrix}, \qquad \tilde b_i = \sqrt{b_i c_i} > 0 \quad (i = 1, 2, \ldots, n-1). \tag{125}$$

From the above properties of the principal minors of a Jacobi matrix it
follows that the corresponding principal minors of $J$ and $\tilde J$ are equal. But
then (124) means that the quadratic form

$$\tilde J(x, x)$$

is positive definite (see Vol. I, Chapter X, Theorem 3, p. 306). But in a
positive-definite quadratic form all the principal minors are positive.
Therefore in $J$ too all the principal minors are positive. Since by 1. all the numbers
$b$, $c$ are positive, by (114) all the minors of $J$ are non-negative; i.e., $J$ is
totally non-negative.

That a totally non-negative matrix $J$ for which 1. and 2. are satisfied is
oscillatory follows immediately from the following criterion for an
oscillatory matrix.

A totally non-negative matrix $A = \| a_{ik} \|_1^n$ is oscillatory if and only if:

1) $A$ is non-singular ($|A| > 0$);

2) All the elements of $A$ in the principal diagonal and the first
super-diagonal and sub-diagonal are different from zero ($a_{ik} > 0$ for $|i - k| \le 1$).

The reader can find a proof of this proposition in [17], Chapter II, § 7.


2. In order to formulate properties of the characteristic values and charac- 
teristic vectors of oscillatory matrices, we introduce some preliminary con- 
cepts and notations. 

We consider a vector (column)

$$u = (u_1, u_2, \ldots, u_n).$$

Let us count the number of variations of sign in the sequence of coordinates
$u_1, u_2, \ldots, u_n$ of $u$, attributing arbitrary signs to the zero coordinates (if
any such exist). Depending on what signs we give to the zero coordinates
the number of variations of sign will vary within certain limits. The
maximal and minimal number of variations of sign so obtained will be
denoted by $S_u^+$ and $S_u^-$, respectively. If $S_u^- = S_u^+$, we shall speak of the exact
number of sign changes and denote it by $S_u$. Obviously $S_u^- = S_u^+$ if and only
if 1. the extreme coordinates $u_1$ and $u_n$ of $u$ are different from zero, and
2. $u_i = 0$ $(1 < i < n)$ always implies that $u_{i-1} u_{i+1} < 0$.
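
The two counts just defined can be computed by trying every assignment of signs to the zero coordinates. The following is a minimal sketch written for illustration; the function names `sign_changes` and `variation_bounds` are assumptions of this example and the brute-force search is only sensible for short vectors.

```python
from itertools import product

def sign_changes(seq):
    """Number of sign changes in a sequence that contains no zeros."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a * b < 0)

def variation_bounds(u):
    """Return (S_minus, S_plus): the minimal and maximal number of sign
    variations over all assignments of signs to the zero coordinates of u."""
    zero_positions = [i for i, x in enumerate(u) if x == 0]
    counts = []
    for signs in product((1, -1), repeat=len(zero_positions)):
        v = list(u)
        for pos, s in zip(zero_positions, signs):
            v[pos] = s          # replace each zero by +1 or -1
        counts.append(sign_changes(v))
    return min(counts), max(counts)

print(variation_bounds([1, 0, -1]))   # interior zero between opposite signs
print(variation_bounds([1, 0, 1]))    # interior zero between like signs
print(variation_bounds([0, 1]))       # zero at an extreme coordinate
```

The first vector satisfies both conditions stated above, so the two counts agree; the other two violate condition 2 and condition 1 respectively, and the counts differ.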
We shall now prove the following fundamental theorem: 


THEOREM 13: 1. An oscillatory matrix $A = \| a_{ik} \|_1^n$ always has $n$ distinct
positive characteristic values

$$\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0. \tag{126}$$

2. The characteristic vector $\overset{1}{u} = (u_{11}, u_{21}, \ldots, u_{n1})$ of $A$ that belongs
to the largest characteristic value $\lambda_1$ has only non-zero coordinates of like
sign; the characteristic vector $\overset{2}{u} = (u_{12}, u_{22}, \ldots, u_{n2})$ that belongs to the
second largest characteristic value $\lambda_2$ has exactly one variation of sign in its
coordinates; more generally, the characteristic vector $\overset{k}{u} = (u_{1k}, u_{2k}, \ldots, u_{nk})$
that belongs to the characteristic value $\lambda_k$ has exactly $k - 1$ variations of sign
$(k = 1, 2, \ldots, n)$.

3. For arbitrary real numbers $c_g, c_{g+1}, \ldots, c_h$ $\left(1 \le g \le h \le n;\ \sum_{k=g}^{h} c_k^2 > 0\right)$
the number of variations of sign in the coordinates of the vector

$$u = \sum_{k=g}^{h} c_k \overset{k}{u} \tag{127}$$

lies between $g - 1$ and $h - 1$:

$$g - 1 \le S_u^- \le S_u^+ \le h - 1. \tag{128}$$


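Proof. 1. We number the characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $A$ so
that

$$|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n|$$

and consider the $p$-th compound matrix $\mathfrak{A}_p$ $(p = 1, 2, \ldots, n)$ (see Chapter I,
§ 4). The characteristic values of $\mathfrak{A}_p$ are all the possible products of $p$
characteristic values of $A$ (see Vol. I, p. 75), i.e., the products

$$\lambda_1 \lambda_2 \cdots \lambda_p, \quad \lambda_1 \lambda_2 \cdots \lambda_{p-1} \lambda_{p+1}, \quad \ldots.$$

From the conditions of the theorem it follows that for some integer $q$, $A^q$
is totally positive. But then $\mathfrak{A}_p \ge O$, $\mathfrak{A}_p^q > O$;⁵³ i.e., $\mathfrak{A}_p$ is irreducible,
non-negative, and primitive. Applying Frobenius' theorem (see § 2, p. 40)
to the primitive matrix $\mathfrak{A}_p$ $(p = 1, 2, \ldots, n)$, we obtain

$$\lambda_1 \lambda_2 \cdots \lambda_p > 0 \qquad (p = 1, 2, \ldots, n),$$

$$\lambda_1 \lambda_2 \cdots \lambda_p > |\lambda_1 \lambda_2 \cdots \lambda_{p-1} \lambda_{p+1}| \qquad (p = 1, 2, \ldots, n-1).$$

Hence (126) follows.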


2. From this inequality (126) it follows that $A = \| a_{ik} \|_1^n$ is a matrix
of simple structure. Then all the compound matrices $\mathfrak{A}_p$ $(p = 1, 2, \ldots, n)$
are also of simple structure (see Vol. I, p. 74).

We consider the fundamental matrix $U = \| u_{ik} \|_1^n$ of $A$ (the $k$-th column
of $U$ contains the coordinates of the $k$-th characteristic vector $\overset{k}{u}$ of $A$; $k = 1, 2, \ldots, n$).
Then (see Vol. I, Chapter III, p. 74), the characteristic vector
of $\mathfrak{A}_p$ belonging to the characteristic value $\lambda_1 \lambda_2 \cdots \lambda_p$ has the coordinates

$$U\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} \qquad (1 \le i_1 < i_2 < \cdots < i_p \le n). \tag{129}$$

By Frobenius' theorem all the numbers (129) are different from zero
and are of like sign. Multiplying the vectors $\overset{1}{u}, \overset{2}{u}, \ldots, \overset{n}{u}$ by $\pm 1$, we can
make all the minors of (129) positive:

$$U\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} > 0
\qquad (1 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n). \tag{130}$$


53 The matrix $\mathfrak{A}_p^q$ is the $p$-th compound matrix of $A^q$ (see Vol. I, Chapter I, p. 20).
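The fundamental matrix $U = \| u_{ik} \|_1^n$ is connected with $A$ by the equation

$$A = U \{\lambda_1, \lambda_2, \ldots, \lambda_n\} U^{-1}. \tag{131}$$

But then

$$A^{\mathsf T} = (U^{\mathsf T})^{-1} \{\lambda_1, \lambda_2, \ldots, \lambda_n\} U^{\mathsf T}. \tag{132}$$

Comparing (131) with (132), we see that

$$V = (U^{\mathsf T})^{-1} \tag{133}$$

is the fundamental matrix of $A^{\mathsf T}$ with the same characteristic values $\lambda_1, \lambda_2,
\ldots, \lambda_n$. But since $A$ is oscillatory, so is $A^{\mathsf T}$. Therefore in $V$ as well for
every $p = 1, 2, \ldots, n$ all the minors

$$V\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} \qquad (1 \le i_1 < i_2 < \cdots < i_p \le n) \tag{134}$$

are different from zero and are of the same sign.
On the other hand, by (133) $U$ and $V$ are connected by the equation

$$U^{\mathsf T} V = E.$$

Going over to the $p$-th compound matrices (see Vol. I, Chapter I, § 4), we
have:

$$\mathfrak{U}_p^{\mathsf T} \mathfrak{V}_p = \mathfrak{E}_p.$$

Hence, in particular, noting that the diagonal elements of $\mathfrak{E}_p$ are 1, we obtain:

$$\sum_{1 \le i_1 < i_2 < \cdots < i_p \le n}
U\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix}
V\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} = 1. \tag{135}$$

On the left-hand side of this equation, the first factor in each of the
summands is positive and the second factors are different from zero and are of
like sign. It is then obvious that the second factors as well are positive; i.e.,

$$V\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ 1 & 2 & \cdots & p \end{pmatrix} > 0
\qquad (1 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n). \tag{136}$$

Thus, the inequalities (130) and (136) hold for $U = \| u_{ik} \|_1^n$ and
$V = (U^{\mathsf T})^{-1}$ simultaneously.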
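When we express the minors of $V$ in terms of those of the inverse matrix
$V^{-1} = U^{\mathsf T}$ by the well-known formulas (see Vol. I, pp. 21-22), we obtain

$$V\begin{pmatrix} j_1 & j_2 & \cdots & j_{n-p} \\ 1 & 2 & \cdots & n-p \end{pmatrix}
= \frac{(-1)^{np + \sum_{\nu=1}^{p} i_\nu}}{|U|}\,
U\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ n & n-1 & \cdots & n-p+1 \end{pmatrix}, \tag{137}$$

where $i_1 < i_2 < \cdots < i_p$ and $j_1 < j_2 < \cdots < j_{n-p}$ together give the complete
system of indices $1, 2, \ldots, n$. Since, by (130), $|U| > 0$ it follows from
(136) and (137) that

$$(-1)^{np + \sum_{\nu=1}^{p} i_\nu}\,
U\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ n & n-1 & \cdots & n-p+1 \end{pmatrix} > 0
\qquad (1 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n). \tag{138}$$

Now let $u = \sum_{k=g}^{h} c_k \overset{k}{u}$ $\left(\sum_{k=g}^{h} c_k^2 > 0\right)$. We shall show that the inequalities
(130) imply the second part of (128):

$$S_u^+ \le h - 1, \tag{139}$$

and the inequalities (138), the first part:

$$S_u^- \ge g - 1. \tag{140}$$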
Suppose that $S_u^+ > h - 1$. Then we can find $h + 1$ coordinates of $u$

$$u_{i_1}, u_{i_2}, \ldots, u_{i_{h+1}} \qquad (1 \le i_1 < i_2 < \cdots < i_{h+1} \le n) \tag{141}$$

such that

$$u_{i_\alpha} u_{i_{\alpha+1}} \le 0 \qquad (\alpha = 1, 2, \ldots, h).$$

Furthermore, the coordinates (141) cannot all be zero; for then we could
equate the corresponding coordinates of the vector $u = \sum_{k=1}^{h} c_k \overset{k}{u}$
$\left(c_1 = \cdots = c_{g-1} = 0;\ \sum_{k=1}^{h} c_k^2 > 0\right)$ to zero and thus obtain a system of homogeneous
equations

$$\sum_{k=1}^{h} c_k u_{i_\alpha k} = 0 \qquad (\alpha = 1, 2, \ldots, h)$$

with the non-zero solution $c_1, c_2, \ldots, c_h$, whereas the determinant of the
system

$$U\begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ 1 & 2 & \cdots & h \end{pmatrix}$$

is different from zero, by (130).

We now consider the vanishing determinant

$$\begin{vmatrix}
u_{i_1 1} & \cdots & u_{i_1 h} & u_{i_1} \\
\vdots & & \vdots & \vdots \\
u_{i_{h+1} 1} & \cdots & u_{i_{h+1} h} & u_{i_{h+1}}
\end{vmatrix} = 0.$$

We expand it with respect to the elements of the last column:

$$\sum_{\alpha=1}^{h+1} (-1)^{h+\alpha+1}\, u_{i_\alpha}\,
U\begin{pmatrix} i_1 & \cdots & i_{\alpha-1} & i_{\alpha+1} & \cdots & i_{h+1} \\ 1 & \cdots & & & \cdots & h \end{pmatrix} = 0.$$

But such an equation cannot hold, since on the left-hand side all the terms
are of like sign and at least one term is different from zero. Hence the
assumption that $S_u^+ > h - 1$ has led to a contradiction, and (139) can be
regarded as proved.

We consider the vectors

$$\overset{k}{u^*} = (u^*_{1k}, u^*_{2k}, \ldots, u^*_{nk}) \qquad (k = 1, 2, \ldots, n),$$

where

$$u^*_{ik} = (-1)^{n+i+k} u_{ik} \qquad (i, k = 1, 2, \ldots, n);$$

then for the matrix $U^* = \| u^*_{ik} \|_1^n$ we have, by (138):

$$U^*\begin{pmatrix} i_1 & i_2 & \cdots & i_p \\ n & n-1 & \cdots & n-p+1 \end{pmatrix} > 0
\qquad (1 \le i_1 < i_2 < \cdots < i_p \le n;\ p = 1, 2, \ldots, n). \tag{142}$$

But the inequalities (142) are analogous to (130). Therefore, by setting

$$u^* = \sum_{k=g}^{h} (-1)^k c_k \overset{k}{u^*}, \tag{143}$$

we have the inequality analogous to (139):⁵⁴

$$S_{u^*}^+ \le n - g. \tag{144}$$

Let $u = (u_1, u_2, \ldots, u_n)$ and $u^* = (u^*_1, u^*_2, \ldots, u^*_n)$. It is easy to see that

$$u^*_i = (-1)^i u_i \qquad (i = 1, 2, \ldots, n).$$

Therefore

$$S_{u^*}^+ + S_u^- = n - 1,$$

and so the relation (140) holds, by (144).

This establishes the inequality (128). Since the second statement of the
theorem is obtained from (128) by setting $g = h = k$, the theorem is now
completely proved.


54 In the inequalities (142), the vectors $\overset{k}{u^*}$ $(k = 1, 2, \ldots, n)$ occur in the inverse order
$\overset{n}{u^*}, \overset{n-1}{u^*}, \ldots$. The vector $\overset{g}{u^*}$ is preceded by $n - g$ vectors of this kind.
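
The first two statements of Theorem 13 can be observed on a Jacobi matrix whose characteristic values and vectors are known in closed form. The sketch below is an illustration under assumptions made here: it takes $a_i = 2$, $b_i = c_i = 1$ with $n = 6$, for which the successive principal minors equal 2, 3, ..., n+1, so that the matrix is oscillatory by the Example above. It uses the standard closed-form values $\lambda_k = 2 + 2\cos\frac{k\pi}{n+1}$ and vectors with coordinates $\sin\frac{ik\pi}{n+1}$, and verifies them numerically instead of assuming them. The names in the code are hypothetical.

```python
import math

def sign_changes(v, tol=1e-12):
    """Count sign changes after discarding (numerically) zero coordinates."""
    nonzero = [x for x in v if abs(x) > tol]
    return sum(1 for a, b in zip(nonzero, nonzero[1:]) if a * b < 0)

n = 6
# Jacobi matrix with a_i = 2 and b_i = c_i = 1 (sample data for this sketch).
J = [[2.0 if i == k else 1.0 if abs(i - k) == 1 else 0.0 for k in range(n)]
     for i in range(n)]

eigenvalues, variations, residuals = [], [], []
for k in range(1, n + 1):
    theta = k * math.pi / (n + 1)
    lam = 2.0 + 2.0 * math.cos(theta)                 # candidate characteristic value
    u = [math.sin(i * theta) for i in range(1, n + 1)]  # candidate characteristic vector
    Ju = [sum(J[i][j] * u[j] for j in range(n)) for i in range(n)]
    # Largest deviation of J u from lam * u; it should vanish up to rounding.
    residuals.append(max(abs(Ju[i] - lam * u[i]) for i in range(n)))
    eigenvalues.append(lam)
    variations.append(sign_changes(u))

print(eigenvalues)   # strictly decreasing and positive, as in (126)
print(variations)    # k - 1 variations of sign for the k-th vector
print(max(residuals))
```

Here $n + 1 = 7$ is prime, so none of the coordinates vanishes and the exact number of sign changes is defined for every vector.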


3. As an application of this theorem, let us study the small oscillations of
$n$ masses $m_1, m_2, \ldots, m_n$ concentrated at $n$ movable points $x_1 < x_2 < \cdots < x_n$
of a segmentary elastic continuum (a string or a rod of finite length),
stretched (in a state of equilibrium) along the segment $0 \le x \le l$ of the
$x$-axis.

We denote by $K(x, s)$ $(0 \le x, s \le l)$ the function of influence of this
continuum ($K(x, s)$ is the displacement at the point $x$ under the action of a
unit force applied at the point $s$) and by $k_{ij}$ the coefficients of influence for
the given $n$ masses:

$$k_{ij} = K(x_i, x_j) \qquad (i, j = 1, 2, \ldots, n).$$

If at the points $x_1, x_2, \ldots, x_n$ $n$ forces $F_1, F_2, \ldots, F_n$ are applied, then
the corresponding static displacement $y(x)$ $(0 \le x \le l)$ is given, by virtue
of the linear superposition of displacements, by the formula

$$y(x) = \sum_{j=1}^{n} K(x, x_j) F_j.$$

When we here replace the forces $F_j$ by the inertial forces $-m_j \dfrac{\partial^2}{\partial t^2} y(x_j, t)$
$(j = 1, 2, \ldots, n)$, we obtain the equation of free oscillations

$$y(x) = -\sum_{j=1}^{n} m_j K(x, x_j) \frac{\partial^2}{\partial t^2} y(x_j, t). \tag{145}$$

We shall seek harmonic oscillations of the continuum in the form

$$y(x) = u(x) \sin(\omega t + \alpha) \qquad (0 \le x \le l). \tag{146}$$

Here $u(x)$ is the amplitude function, $\omega$ the frequency, and $\alpha$ the initial
phase. Substituting this expression for $y(x)$ in (145) and cancelling
$\sin(\omega t + \alpha)$, we obtain

$$u(x) = \omega^2 \sum_{j=1}^{n} m_j K(x, x_j) u(x_j). \tag{147}$$
Let us introduce a notation for the variable displacements and the
displacements in amplitude at the points of distribution of mass:

$$y_i = y(x_i, t), \qquad u_i = u(x_i) \qquad (i = 1, 2, \ldots, n).$$

Then

$$y_i = u_i \sin(\omega t + \alpha) \qquad (i = 1, 2, \ldots, n).$$

We also introduce the reduced amplitude displacements and the reduced
coefficients of influence

$$\tilde u_i = \sqrt{m_i}\, u_i, \qquad a_{ij} = \sqrt{m_i m_j}\, k_{ij} \qquad (i, j = 1, 2, \ldots, n). \tag{148}$$

Replacing $x$ in (147) by $x_i$ $(i = 1, 2, \ldots, n)$ successively, we obtain a
system of equations for the amplitude displacements:

$$\sum_{j=1}^{n} a_{ij} \tilde u_j = \lambda \tilde u_i \qquad \left(\lambda = \frac{1}{\omega^2};\ i = 1, 2, \ldots, n\right). \tag{149}$$

Hence it is clear that the amplitude vector $\tilde u = (\tilde u_1, \tilde u_2, \ldots, \tilde u_n)$ is a
characteristic vector of $A = \| a_{ij} \|_1^n = \| \sqrt{m_i m_j}\, k_{ij} \|_1^n$ for $\lambda = 1/\omega^2$ (see Vol. I, Chapter
X, § 8).

It can be established, as the result of a detailed analysis,⁵⁵ that the matrix
of the coefficients of influence $\| k_{ij} \|_1^n$ of a segmentary continuum is always
oscillatory. But then the matrix $A = \| a_{ij} \|_1^n = \| \sqrt{m_i m_j}\, k_{ij} \|_1^n$ is also
oscillatory! Therefore (by Theorem 13) $A$ has $n$ positive characteristic values

$$\lambda_1 > \lambda_2 > \cdots > \lambda_n > 0;$$

i.e., there exist $n$ harmonic oscillations of the continuum with distinct
frequencies:

$$(0 <)\ \omega_1 < \omega_2 < \cdots < \omega_n \qquad \left(\lambda_i = \frac{1}{\omega_i^2};\ i = 1, 2, \ldots, n\right).$$

By the same theorem to the fundamental frequency $\omega_1$ there correspond
amplitude displacements different from zero and of like sign. Among the
displacements in amplitude corresponding to the first overtone with the
frequency $\omega_2$ there is exactly one variation of sign and, in general, among
the displacements in amplitude for the overtone with the frequency $\omega_j$ there
are exactly $j - 1$ variations of sign $(j = 1, 2, \ldots, n)$.


55 See [239], [240], and [17], Chapter III.
From the fact that the matrix of the coefficients of influence $\| k_{ij} \|_1^n$
is oscillatory there follow other oscillatory properties of the continuum:
1) For $\omega = \omega_1$ the amplitude function $u(x)$, which is connected with the
amplitude displacements by (147), has no nodes; and, in general, for $\omega = \omega_j$
the function has $j - 1$ nodes $(j = 1, 2, \ldots, n)$; 2) The nodes of two adjacent
harmonics alternate, etc.

We cannot dwell here on the justification of these properties.⁵⁶


56 See [17], Chapters III and IV.


CHAPTER XIV 


APPLICATIONS OF THE THEORY OF MATRICES 
TO THE INVESTIGATION OF SYSTEMS OF 
LINEAR DIFFERENTIAL EQUATIONS 


§ 1. Systems of Linear Differential Equations with Variable 
Coefficients. General Concepts 


1. Suppose given a system of linear homogeneous differential equations of
the first order:

$$\frac{dx_i}{dt} = \sum_{k=1}^{n} p_{ik}(t)\, x_k \qquad (i = 1, 2, \ldots, n), \tag{1}$$

where $p_{ik}(t)$ $(i, k = 1, 2, \ldots, n)$ are complex functions of a real argument $t$,
continuous in some interval, finite or infinite, of the variable $t$.¹
Setting $P(t) = \| p_{ik}(t) \|_1^n$ and $x = (x_1, x_2, \ldots, x_n)$, we write (1) as

$$\frac{dx}{dt} = P(t)\, x. \tag{2}$$

An integral matrix of the system (1) shall be defined as a square matrix
$X(t) = \| x_{ik}(t) \|_1^n$ whose columns are $n$ linearly independent solutions of
the system.
Since every column of $X$ satisfies (2), the integral matrix $X$ satisfies the
equation

$$\frac{dX}{dt} = P(t)\, X. \tag{3}$$

In what follows, we shall consider the matrix equation (3) instead of 
the system (1). 

From the theorem on the existence and uniqueness of the solution of a
system of differential equations² it follows that the integral matrix $X(t)$
is uniquely determined when the value of the matrix for some ('initial')
value $t = t_0$ is known,³ $X(t_0) = X_0$. For $X_0$ we can take an arbitrary
non-singular square matrix of order $n$. In the particular case where $X(t_0) = E$,
the integral matrix $X(t)$ will be called normalized.


1 In this section, all the relations that involve functions of $t$ refer to the given interval.


2 A proof of this theorem will be given in § 5. See also I. G. Petrowski (Petrovskii),
Vorlesungen über die Theorie der gewöhnlichen Differentialgleichungen, Leipzig, 1954
(translated from the Russian: Moscow, 1952).

Let us differentiate the determinant of $X$ by differentiating its rows in
succession and let us then use the differential relations

$$\frac{dx_{ij}}{dt} = \sum_{k=1}^{n} p_{ik} x_{kj} \qquad (i, j = 1, 2, \ldots, n).$$

We obtain:

$$\frac{d|X|}{dt} = (p_{11} + p_{22} + \cdots + p_{nn})\, |X|.$$

Hence there follows the well-known Jacobi identity

$$|X| = c\, e^{\int_{t_0}^{t} \operatorname{tr} P \, dt}, \tag{4}$$

where $c$ is a constant and

$$\operatorname{tr} P = p_{11} + p_{22} + \cdots + p_{nn}$$

is the trace of $P(t)$.
Since the determinant $|X|$ cannot vanish identically, we have $c \ne 0$.
But then it follows from the Jacobi identity that $|X|$ is different from zero
for every value of the argument:

$$|X| \ne 0;$$

i.e., an integral matrix is non-singular for every value of the argument.
If $\tilde X(t)$ is a non-singular ($|\tilde X(t)| \ne 0$) particular solution of (3), then
the general solution is determined by the formula

$$X = \tilde X C, \tag{5}$$

where $C$ is an arbitrary constant matrix.
For, by multiplying both sides of the equation

$$\frac{d\tilde X}{dt} = P \tilde X \tag{6}$$

by $C$ on the right, we see that the matrix $\tilde X C$ also satisfies (3). On the other
hand, if $X$ is an arbitrary solution of (3), then (6) implies:

$$\frac{dX}{dt} = \frac{d}{dt}\left[ \tilde X (\tilde X^{-1} X) \right] = P X + \tilde X \frac{d}{dt} (\tilde X^{-1} X),$$

and hence by (3)

$$\frac{d}{dt} (\tilde X^{-1} X) = 0$$

and

$$\tilde X^{-1} X = \text{const.} = C;$$

i.e., (5) holds.
All the integral matrices $X$ of the system (1) are obtained by the formula
(5) with $|C| \ne 0$.


3 It is assumed that $t_0$ belongs to the given interval of $t$.
2. Let us consider the special case:

$$\frac{dX}{dt} = AX, \tag{7}$$

where $A$ is a constant matrix. Here $\tilde X = e^{At}$ is a particular non-singular
solution of (7),⁴ so that the general solution is of the form

$$X = e^{At} C, \tag{8}$$

where $C$ is an arbitrary constant matrix.
Setting $t = t_0$ in (8) we find: $X_0 = e^{At_0} C$. Hence $C = e^{-At_0} X_0$ and therefore
(8) can be represented in the form

$$X = e^{A(t - t_0)} X_0. \tag{9}$$

This formula is equivalent to our earlier formula (46) of Chapter V (Vol. I,
p. 118).
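
Formula (9) and the Jacobi identity (4) can be checked together for a constant coefficient matrix, where $\int_{t_0}^{t} \operatorname{tr} A \, dt = (t - t_0) \operatorname{tr} A$ and $c = |X_0|$. The sketch below is only an illustration: the matrices `A` and `X0`, the values of `t0` and `t`, and the helper names are assumptions of this example, and the exponential is computed by a plain partial sum of the series $\sum_k A^k t^k / k!$, which is adequate for a small matrix of moderate norm but is not a general-purpose method.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Partial sum of the series sum_k A^k / k! with the given number of terms."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, a)                       # A^k / (k-1)!
        term = [[x / k for x in row] for row in term]  # A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[1.0, 2.0], [0.0, -0.5]]    # hypothetical constant coefficient matrix
X0 = [[2.0, 1.0], [1.0, 1.0]]    # hypothetical non-singular initial value, |X0| = 1
t0, t = 0.5, 1.5

At = [[x * (t - t0) for x in row] for row in A]
X = mat_mul(mat_exp(At), X0)                         # formula (9)

lhs = det2(X)
rhs = det2(X0) * math.exp((A[0][0] + A[1][1]) * (t - t0))   # identity (4) with c = |X0|
print(lhs, rhs)
```

The two printed numbers agree up to rounding, and setting `t = t0` in the same computation returns `X0`, in accordance with the initial condition.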
Let us now consider the so-called Cauchy system:

$$\frac{dX}{dt} = \frac{A}{t - a}\, X \qquad (A \text{ is a constant matrix}). \tag{10}$$

This case reduces to the preceding one by a change of argument:

$$u = \ln (t - a).$$

Therefore the general solution of (10) looks as follows:

$$X = e^{A \ln (t - a)} C = (t - a)^A C. \tag{11}$$

The functions $e^{At}$ and $(t - a)^A$ that occur in (8) and (11) may be represented
in the form (Vol. I, p. 117)

$$e^{At} = \sum_{k=1}^{s} \left( Z_{k1} + Z_{k2} t + \cdots + Z_{k m_k} t^{m_k - 1} \right) e^{\lambda_k t}, \tag{12}$$

$$(t - a)^A = \sum_{k=1}^{s} \left( Z_{k1} + Z_{k2} \ln (t - a) + \cdots + Z_{k m_k} [\ln (t - a)]^{m_k - 1} \right) (t - a)^{\lambda_k}. \tag{13}$$

Here

$$\psi(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_s)^{m_s}
\qquad (\lambda_i \ne \lambda_k \text{ for } i \ne k;\ i, k = 1, 2, \ldots, s)$$

is the minimal polynomial of $A$, and $Z_{kj}$ $(j = 1, 2, \ldots, m_k;\ k = 1, 2, \ldots, s)$
are linearly independent constant matrices that are polynomials in $A$.⁵


4 By term-by-term differentiation of the series $e^{At} = \sum_{k=0}^{\infty} \dfrac{A^k}{k!} t^k$ we find $\dfrac{d}{dt} e^{At} = A e^{At}$.

Note. Sometimes an integral matrix of the system of differential equations
(1) is taken to be a matrix $W$ in which the rows are linearly independent
solutions of the system. It is obvious that $W$ is the transpose of $X$:

$$W = X^{\mathsf T}.$$

When we go over to the transposed matrices on both sides of (3), we
obtain instead of (3) the following equation for $W$:

$$\frac{dW}{dt} = W P^{\mathsf T}(t).$$

Here $W$ is the first factor on the right-hand side, not the second, as $X$ was
in (3).


§ 2. Lyapunov Transformations 


1. Let us now assume that in the system (1) (and in the equation (3))
the coefficient matrix $P(t) = \| p_{ik}(t) \|_1^n$ is a continuous bounded function
of $t$ in the interval $[t_0, \infty)$.⁶

In place of the unknown functions $x_1, x_2, \ldots, x_n$ we introduce the new
unknown functions $y_1, y_2, \ldots, y_n$ by means of the transformation

$$x_i = \sum_{k=1}^{n} l_{ik}(t)\, y_k \qquad (i = 1, 2, \ldots, n). \tag{14}$$

We impose the following restrictions on the matrix $L(t) = \| l_{ik}(t) \|_1^n$
of the transformation:

1. $L(t)$ has a continuous derivative $\dfrac{dL}{dt}$ in the interval $[t_0, \infty)$;

2. $L(t)$ and $\dfrac{dL}{dt}$ are bounded in the interval $[t_0, \infty)$;

3. There exists a constant $m$ such that

$$0 < m < \text{absolute value of } |L(t)| \qquad (t \ge t_0),$$

i.e., the determinant $|L(t)|$ is bounded in modulus from below by the positive
constant $m$.


5 Every term $X_k = (Z_{k1} + Z_{k2} t + \cdots + Z_{k m_k} t^{m_k - 1}) e^{\lambda_k t}$ $(k = 1, 2, \ldots, s)$ on the
right-hand side of (12) is a solution of (7). For the product $g(A) e^{At}$, with an arbitrary
function $g(\lambda)$, satisfies this equation. But $X_k = f(A) = g(A) e^{At}$ if $f(\lambda) = g(\lambda) e^{\lambda t}$ and
$g(\lambda_k) = 1$, and all the remaining $m - 1$ values of $g(\lambda)$ on the spectrum of $A$ are zero
(see Vol. I, Chapter V, formula (17), on p. 104).

6 This means that each function $p_{ik}(t)$ $(i, k = 1, 2, \ldots, n)$ is continuous and bounded
in the interval $[t_0, \infty)$, i.e., $t \ge t_0$.

A transformation (14) in which the coefficient matrix $L(t) = \| l_{ik}(t) \|_1^n$
satisfies 1.-3. will be called a Lyapunov transformation and the corresponding
matrix $L(t)$ a Lyapunov matrix.

Such transformations were investigated by A. M. Lyapunov in his
famous memoir 'The General Problem of Stability of Motion' [32].

Examples. 1. If $L = \text{const.}$ and $|L| \ne 0$, then $L$ satisfies the conditions
1.-3. Therefore a non-singular transformation with constant coefficients is
always a Lyapunov transformation.

2. If $D = \| d_{ik} \|_1^n$ is a matrix of simple structure with pure imaginary
characteristic values, then the matrix

$$L(t) = e^{Dt}$$

satisfies the conditions 1.-3. and is therefore a Lyapunov matrix.⁷


2. It is easy to verify that the conditions 1.-3. of a matrix $L(t)$ imply the
existence of the inverse matrix $L^{-1}(t)$ also satisfying the conditions 1.-3.;
i.e., the inverse of a Lyapunov transformation is itself a Lyapunov
transformation. In the same way it can be verified that two Lyapunov
transformations in succession yield a Lyapunov transformation. Thus, the Lyapunov
transformations form a group. They have the following important property:

If under the transformation (14) the system (1) goes over into

$$\frac{dy_i}{dt} = \sum_{k=1}^{n} q_{ik}(t)\, y_k \tag{15}$$

and if the zero solution of this system is stable, asymptotically stable, or
unstable in the sense of Lyapunov (see Vol. I, Chapter V, § 6), then the zero
solution of the original system (1) has the same property.


7 Here all the $m_k = 1$ in (12) and $\lambda_k = i\varphi_k$ ($\varphi_k$ real, $k = 1, 2, \ldots, s$).
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In other words, Lyapunov transformations do not alter the character 
of the zero solution (as regards stability). This is the reason why these 
transformations can be used in the investigation of stability in order to 
simplify the original system of equations. 

A Lyapunov transformation establishes a one-to-one correspondence be- 
tween the solutions of the systems (1) and (15) ; moreover, linearly inde- 
pendent solutions remain so after the transformation. Therefore a Lyapunov 
transformation carries an integral matrix X of (1) into some integral 
matrix Y of (15) such that 


$$X = L(t)\,Y. \tag{16}$$


In matrix notation, the system (15) has the form
$$\frac{dY}{dt} = Q(t)\,Y, \tag{17}$$

where $Q(t) = \| q_{ik}(t) \|_1^n$ is the coefficient matrix of (15).

Substituting LY for X in (3) and comparing the equation so obtained 
with (17), we easily find the following formula which expresses Q in terms 
of P and L: 


$$Q = L^{-1} P L - L^{-1} \frac{dL}{dt}. \tag{18}$$


Two systems (1) and (15) or, what is the same, (3) and (17) will be
called equivalent (in the sense of Lyapunov) if they can be carried into one
another by a Lyapunov transformation. The coefficient matrices P and Q
of equivalent systems are always connected by the formula (18) in which 
L satisfies the conditions 1.-3. 
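
The formula (18) can be checked by direct substitution. The short derivation below is a sketch; it assumes, as the surrounding text indicates, that (3) is the matrix form $dX/dt = PX$ of the system (1).

```latex
X = LY: \qquad
\frac{dX}{dt} = \frac{dL}{dt}\,Y + L\,\frac{dY}{dt} = P L Y
\quad\Longrightarrow\quad
\frac{dY}{dt} = \Bigl( L^{-1} P L - L^{-1}\frac{dL}{dt} \Bigr) Y = Q\,Y .
```

Since $L$ and $dL/dt$ are bounded and, by condition 3., $L^{-1}$ is bounded as well, a bounded $P$ yields a bounded $Q$.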


§ 3. Reducible Systems 


1. Among the systems of linear differential equations of the first order the
simplest and best known are those with constant coefficients. It is, therefore,
of interest to study systems that can be carried by a Lyapunov
transformation into systems with constant coefficients. Lyapunov has called such
systems reducible.

Suppose we are given a reducible system
$$\frac{dX}{dt} = PX. \tag{19}$$
Then some Lyapunov transformation
$$X = L(t)\,Y \tag{20}$$
carries it into a system
$$\frac{dY}{dt} = AY, \tag{21}$$
where A is a constant matrix. Therefore (19) has the particular solution
$$X = L(t)\,e^{At}. \tag{22}$$


It is easy to see that, conversely, every system (19) with a particular solution
of the form (22), where L(t) is a Lyapunov matrix and A a constant
matrix, is reducible and is reduced to the form (21) by means of the Lyapunov
transformation (20).

Following Lyapunov, we shall show that: Every system (19) with
periodic coefficients is reducible.⁸

Let $P(t)$ in (19) be a continuous function in $(-\infty, +\infty)$ with period $\tau$:
$$P(t + \tau) = P(t). \tag{23}$$

Replacing $t$ in (19) by $t + \tau$ and using (23), we obtain:
$$\frac{dX(t+\tau)}{dt} = P(t)\,X(t+\tau).$$

Thus, $X(t+\tau)$ is an integral matrix of (19) if $X(t)$ is. Therefore
$$X(t+\tau) = X(t)\,V,$$
where V is a constant non-singular matrix. Since $|V| \neq 0$, we can
determine⁹
$$V^{t/\tau} = e^{\frac{t}{\tau}\ln V}.$$

This matrix function of $t$, just like $X(t)$, is multiplied on the right by V
when the argument is increased by $\tau$. Therefore the 'quotient'
$$L(t) = X(t)\,V^{-t/\tau} = X(t)\,e^{-\frac{t}{\tau}\ln V}$$
is continuous and periodic with period $\tau$:
$$L(t+\tau) = L(t),$$
and with $|L| \neq 0$. The matrix $L(t)$ satisfies the conditions 1.-3. of the
preceding section and is therefore a Lyapunov matrix.


8 See [32], § 47. 

9 Here $\ln V = f(V)$, where $f(\lambda)$ is any single-valued branch of $\ln \lambda$ in the
simply-connected domain G containing all the characteristic values of V, but not
containing 0. See Vol. I, Chapter V.

On the other hand, since the solution $X$ of (19) can be represented in
the form
$$X = L(t)\,e^{\frac{t}{\tau}\ln V},$$
the system (19) is reducible.
In this case the Lyapunov transformation
$$X = L(t)\,Y,$$
which carries (19) into the form
$$\frac{dY}{dt} = \frac{1}{\tau}\ln V \cdot Y,$$
has periodic coefficients with period $\tau$.

Lyapunov has established¹⁰ a very important criterion for stability and
instability of a first linear approximation to a non-linear system of
differential equations
$$\frac{dx_i}{dt} = \sum_{k=1}^{n} a_{ik}\,x_k + (**) \quad (i = 1, 2, \ldots, n), \tag{24}$$
where we have convergent power series in $x_1, x_2, \ldots, x_n$ on the right-hand
side and where (**) denotes the sum of the terms of second and higher orders
in $x_1, x_2, \ldots, x_n$; the coefficients $a_{ik}$ $(i, k = 1, 2, \ldots, n)$ of the linear terms
are constant.¹¹


LYAPUNOV'S CRITERION: The zero solution of (24) is stable (and even
asymptotically stable) if all the characteristic values of the coefficient matrix
$A = \| a_{ik} \|_1^n$ of the first linear approximation have negative real parts, and
unstable if at least one characteristic value has a positive real part.


2. The arguments used above enable us to apply this criterion to a system
whose linear terms have periodic coefficients:
$$\frac{dx_i}{dt} = \sum_{k=1}^{n} p_{ik}(t)\,x_k + (**). \tag{25}$$

For on the basis of the preceding arguments we reduce the system (25) to
the form (24) by means of a Lyapunov transformation, where
$$A = \| a_{ik} \|_1^n = \frac{1}{\tau}\ln V$$
and where V is the constant matrix by which an integral matrix of the
corresponding linear system (19) is multiplied when the argument is changed
by $\tau$. Without loss of generality, we may assume that $\tau > 0$. By the
properties of Lyapunov transformations the zero solutions of the original and of
the transformed systems are simultaneously stable, asymptotically stable,
or unstable. But the characteristic values $\lambda_i$ and $\nu_i$ $(i = 1, 2, \ldots, n)$ of A
and V are connected by the formula
$$\lambda_i = \frac{1}{\tau}\ln \nu_i \quad (i = 1, 2, \ldots, n).$$


10 See [32], § 24.
11 The coefficients in the non-linear terms may depend on t. These functional
coefficients are subject to certain restrictions (see [32], § 11).


Therefore, by applying Lyapunov's criterion to the reduced systems we
find:¹²

The zero solution of (25) is asymptotically stable if all the characteristic
values $\nu_1, \nu_2, \ldots, \nu_n$ of V are of modulus less than 1 and unstable if at least
one characteristic value is of modulus greater than 1.
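
The matrix V and its characteristic values can be approximated numerically. The sketch below is an illustration under stated assumptions: the coefficient matrix `P(t)` (a damped Mathieu-type system of period $2\pi$), the Runge-Kutta integrator and the step counts are all chosen here and do not come from the text. It integrates $dX/dt = P(t)X$ with $X(0) = E$ over one period, so that $V = X(\tau)$, and then prints the moduli of the two characteristic values of V.

```python
import cmath
import math

def mat_mul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def P(t):
    # Hypothetical periodic coefficient matrix with period 2*pi (assumption).
    return [[0.0, 1.0], [-(1.0 + 0.5 * math.cos(t)), -0.2]]

def rhs(t, X):
    """Right-hand side P(t) X of the linear system."""
    return mat_mul(P(t), X)

def axpy(X, K, h):
    """Return X + h * K entry by entry."""
    return [[X[i][j] + h * K[i][j] for j in range(2)] for i in range(2)]

def integrate(t_end, steps):
    """Integrate dX/dt = P(t) X from X(0) = E to t_end with classical RK4."""
    X = [[1.0, 0.0], [0.0, 1.0]]
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, X)
        k2 = rhs(t + h / 2, axpy(X, k1, h / 2))
        k3 = rhs(t + h / 2, axpy(X, k2, h / 2))
        k4 = rhs(t + h, axpy(X, k3, h))
        X = [[X[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
              for j in range(2)] for i in range(2)]
        t += h
    return X

tau = 2 * math.pi
V = integrate(tau, 4000)            # X(tau) = X(0) V = V, since X(0) = E

# Characteristic values of the 2 x 2 matrix V from its trace and determinant.
trace = V[0][0] + V[1][1]
det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
root = cmath.sqrt(complex(trace * trace - 4 * det))
nu1, nu2 = (trace + root) / 2, (trace - root) / 2

print("moduli of the characteristic values of V:", abs(nu1), abs(nu2))
print("both less than 1:", abs(nu1) < 1 and abs(nu2) < 1)
```

Two consistency checks are available without knowing the answer in advance: the determinant of V must equal $e^{\int_0^\tau \operatorname{tr} P\,dt}$ by Jacobi's identity, and integrating over two periods must reproduce $V^2$.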


Lyapunov has established his criterion for the stability of a linear
approximation for a considerably wider class of systems, namely those of the
form (24) in which the linear approximation is not necessarily a system with
constant coefficients, but belongs to a class of systems that he has called
regular.¹³

The class of regular linear systems contains all the reducible systems.

A criterion for instability in the case when the first linear approximation
is a regular system was set up by N. G. Chetaev.¹⁴


§ 4. The Canonical Form of a Reducible System. Erugin’s Theorem 


1. Suppose that a reducible system (19) and an equivalent system
$$\frac{dY}{dt} = AY$$
(in the sense of Lyapunov) are given, where A is a constant matrix.

We shall be interested in the question: To what extent is the matrix A
determined by the given system (19)? This question can also be formulated
as follows:

When are two systems
$$\frac{dY}{dt} = AY \quad \text{and} \quad \frac{dZ}{dt} = BZ,$$
where A and B are constant matrices, equivalent in the sense of Lyapunov;
i.e., when can they be carried into one another by a Lyapunov
transformation?


12 Loc. cit., § 55.
13 Loc. cit., § 9.
14 See [9], p. 181.

In order to answer this question we introduce the notion of matrices with 
one and the same real part of the spectrum. 

We shall say that two matrices A and B of order n have one and the same 
real part of the spectrum if and only if the elementary divisors of A and B 
are of the form 


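$$(\lambda - \lambda_1)^{m_1}, (\lambda - \lambda_2)^{m_2}, \ldots, (\lambda - \lambda_s)^{m_s}; \quad (\lambda - \mu_1)^{m_1}, (\lambda - \mu_2)^{m_2}, \ldots, (\lambda - \mu_s)^{m_s},$$
where
$$\operatorname{Re} \lambda_k = \operatorname{Re} \mu_k \quad (k = 1, 2, \ldots, s).$$
Then the following theorem due to N. P. Erugin holds:¹⁵


THEOREM 1 (Erugin): Two systems
$$\frac{dX}{dt} = AX \quad \text{and} \quad \frac{dY}{dt} = BY \tag{26}$$
(A and B are constant matrices of order n) are equivalent in the sense of
Lyapunov if and only if the matrices A and B have one and the same real
part of the spectrum.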

Proof. Suppose that the systems (26) are given. We reduce A to the
normal Jordan form¹⁶ (see Vol. I, Chapter VI, § 7)
$$A = T \{ \lambda_1 E_1 + H_1, \lambda_2 E_2 + H_2, \ldots, \lambda_s E_s + H_s \} T^{-1}, \tag{27}$$
where
$$\lambda_k = \alpha_k + i\beta_k \quad (\alpha_k, \beta_k \text{ are real numbers}; k = 1, 2, \ldots, s). \tag{28}$$
In accordance with (27) and (28) we set
$$A_1 = T \{ \alpha_1 E_1 + H_1, \alpha_2 E_2 + H_2, \ldots, \alpha_s E_s + H_s \} T^{-1}, \quad
A_2 = T \{ i\beta_1 E_1, i\beta_2 E_2, \ldots, i\beta_s E_s \} T^{-1}. \tag{29}$$
Then
$$A = A_1 + A_2, \quad A_1 A_2 = A_2 A_1. \tag{30}$$


15 Our proof of the theorem differs from that of Erugin.

16 $E_k$ is the unit matrix; in $H_k$ the elements of the first superdiagonal are 1, and the
remaining elements are zero; the orders of $E_k$, $H_k$ are the degrees of the k-th elementary
divisor of A, i.e., $m_k$ $(k = 1, 2, \ldots, s)$.


We define a matrix $L(t)$ by the equation
$$L(t) = e^{A_2 t}.$$

$L(t)$ is a Lyapunov matrix (see Example 2 on p. 117).
But by (30) a particular solution of the first of the systems (26) is of
the form
$$e^{At} = e^{A_2 t} e^{A_1 t} = L(t)\,e^{A_1 t}.$$

Hence it follows that the first of the systems (26) is equivalent to
$$\frac{dU}{dt} = A_1 U, \tag{31}$$
where, by (29), the matrix $A_1$ has real characteristic values and its spectrum
coincides with the real part of the spectrum of A.
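
The splitting of A into a part with real characteristic values and a commuting part with pure imaginary ones can be followed on a small worked example. The matrix $A$ and the real numbers $\alpha$, $\beta$ below are chosen purely for illustration and do not occur in the text.

```latex
A = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix},
\qquad \lambda_{1,2} = \alpha \pm i\beta, \\
A_1 = \alpha E, \qquad
A_2 = \beta \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\qquad A_1 A_2 = A_2 A_1, \\
L(t) = e^{A_2 t} =
\begin{pmatrix} \cos\beta t & \sin\beta t \\ -\sin\beta t & \cos\beta t \end{pmatrix},
\qquad e^{At} = L(t)\, e^{\alpha t} .
```

Here $L(t)$ is a rotation, hence bounded together with its derivative and of determinant 1, so that in this example the system $dX/dt = AX$ is equivalent in the sense of Lyapunov to $dU/dt = \alpha U$.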

Similarly, we replace the second of the systems (26) by the equivalent
system
$$\frac{dV}{dt} = B_1 V, \tag{32}$$
where the matrix $B_1$ has real characteristic values and its spectrum coincides
with the real part of the spectrum of B.

Our theorem will be proved if we can show that the two systems (31)
and (32) in which $A_1$ and $B_1$ are constant matrices with real characteristic
values are equivalent if and only if $A_1$ and $B_1$ are similar.¹⁷

Suppose that the Lyapunov transformation
$$U = L_1 V$$
carries (31) into (32). Then the matrix $L_1$ satisfies the equation
$$\frac{dL_1}{dt} = A_1 L_1 - L_1 B_1. \tag{33}$$
This matrix equation for $L_1$ is equivalent to a system of $n^2$ differential
equations in the $n^2$ elements of $L_1$. The right-hand side of (33) is a linear
operation on the 'vector' $L_1$ in an $n^2$-dimensional space:
$$\frac{dL_1}{dt} = F(L_1), \quad (F(L_1) = A_1 L_1 - L_1 B_1). \tag{33'}$$


17 This proposition implies Theorem 1, since the equivalence of the systems (31) and
(32) means that the systems (26) are equivalent, and the similarity of $A_1$ and $B_1$ means
that these matrices have the same elementary divisors, so that the matrices A and B have
one and the same real part of the spectrum.


Every characteristic value of the linear operator F (and of the
corresponding matrix of order $n^2$) can be represented in the form of a difference
$\gamma - \delta$, where $\gamma$ is a characteristic value of $A_1$ and $\delta$ a characteristic value
of $B_1$.¹⁸ Hence it follows that the operator F has only real characteristic
values.

We denote by
$$\psi(\lambda) = (\lambda - \lambda_1)^{m_1} (\lambda - \lambda_2)^{m_2} \cdots (\lambda - \lambda_u)^{m_u}$$
(the $\lambda_i$ are real; $\lambda_i \neq \lambda_j$ for $i \neq j$; $i, j = 1, 2, \ldots, u$) the minimal polynomial
of F. Then the solution $L_1(t) = e^{Ft} L^{(0)}$ of (33') can, by formula (12)
(p. 116), be written as follows:
$$L_1(t) = \sum_{k=1}^{u} \sum_{j=0}^{m_k - 1} L_{kj}\, t^j e^{\lambda_k t}, \tag{34}$$
where the $L_{kj}$ are constant matrices of order n. Since the matrix $L_1(t)$ is
bounded in the interval $(t_0, \infty)$, both for every $\lambda_k > 0$ and for $\lambda_k = 0$ and
$j > 0$, the corresponding matrices $L_{kj} = O$. We denote by $L_-(t)$ the sum of
all the terms in (34) for which $\lambda_k < 0$. Then
$$L_1(t) = L_-(t) + L_0, \tag{35}$$
where
$$\lim_{t \to +\infty} L_-(t) = 0, \quad \lim_{t \to +\infty} \frac{dL_-(t)}{dt} = 0, \quad L_0 = \text{const.} \tag{35'}$$
Then, by (35) and (35'),
$$\lim_{t \to +\infty} L_1(t) = L_0,$$
from which it follows that
$$|L_0| \neq 0,$$
because the determinant $|L_1(t)|$ is bounded in modulus from below.


18 For let $\Lambda_0$ be any characteristic value of the operator F. Then there exists a matrix
$L \neq O$ such that $F(L) = \Lambda_0 L$, or
$$(A_1 - \Lambda_0 E)\,L = L B_1. \quad (*)$$
The matrices $A_1 - \Lambda_0 E$ and $B_1$ have at least one characteristic value in common, since
otherwise there would exist a polynomial $g(\lambda)$ such that
$$g(A_1 - \Lambda_0 E) = O, \quad g(B_1) = E,$$
and this is impossible, because it follows from (*) that $g(A_1 - \Lambda_0 E) \cdot L = L \cdot g(B_1)$
and $L \neq O$. But if $A_1 - \Lambda_0 E$ and $B_1$ have a common characteristic value, then $\Lambda_0 = \gamma - \delta$,
where $\gamma$ and $\delta$ are characteristic values of $A_1$ and $B_1$, respectively. A detailed study of
the operator F can be found in the paper [179] by F. Golubchikov.
When we substitute for $L_1(t)$ in (33) the sum $L_-(t) + L_0$, we obtain:
$$\frac{dL_-(t)}{dt} - A_1 L_-(t) + L_-(t) B_1 = A_1 L_0 - L_0 B_1;$$
hence by (35')
$$A_1 L_0 - L_0 B_1 = O$$
and therefore
$$B_1 = L_0^{-1} A_1 L_0. \tag{36}$$

Conversely, if (36) holds, then the Lyapunov transformation
$$U = L_0 V$$
carries (31) into (32). This completes the proof of the theorem.
2. From this theorem it follows that: Every reducible system (19) can be
carried by the Lyapunov transformation $X = LY$ into the form
$$\frac{dY}{dt} = JY,$$
where J is a Jordan matrix with real characteristic values. This canonical
form of the system is uniquely determined by the given matrix P(t) to 
within the order of the diagonal blocks of J. 


§ 5. The Matricant


1. We consider a system of differential equations
$$\frac{dX}{dt} = P(t)\,X, \tag{37}$$
where $P(t) = \| p_{ik}(t) \|_1^n$ is a continuous matrix function of the argument
t in some interval (a, b).¹⁹


19 (a, b) is an arbitrary interval (finite or infinite). All the elements $p_{ik}(t)$
$(i, k = 1, 2, \ldots, n)$ of P(t) are complex functions of the real argument t, continuous in (a, b).
Everything that follows remains valid if, instead of continuity, we require (in every finite
subinterval of (a, b)) only boundedness and Riemann integrability of all the functions
$p_{ik}(t)$.

We use the method of successive approximations to determine a normalized
solution of (37), i.e., a solution that for $t = t_0$ becomes the unit matrix
($t_0$ is a fixed number of the interval (a, b)). The successive approximations
$X_k$ $(k = 0, 1, 2, \ldots)$ are found from the recurrence relations
$$\frac{dX_k}{dt} = P(t)\,X_{k-1} \quad (k = 1, 2, \ldots),$$
when $X_0$ is taken to be the unit matrix E.
Setting $X_k(t_0) = E$ $(k = 0, 1, 2, \ldots)$ we may represent $X_k$ in the form
$$X_k = E + \int_{t_0}^{t} P(\tau)\,X_{k-1}(\tau)\,d\tau.$$


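Thus
$$X_0 = E, \quad X_1 = E + \int_{t_0}^{t} P(\tau)\,d\tau, \quad
X_2 = E + \int_{t_0}^{t} P(\tau)\,d\tau + \int_{t_0}^{t} P(\tau) \int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau, \ldots,$$
i.e., $X_k$ $(k = 0, 1, 2, \ldots)$ is the sum of the first k + 1 terms of the matrix
series
$$E + \int_{t_0}^{t} P(\tau)\,d\tau + \int_{t_0}^{t} P(\tau) \int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau + \cdots. \tag{38}$$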


In order to prove that this series is absolutely and uniformly convergent
in every closed subinterval of the interval (a, b) and determines the required
solution of (37), we construct a majorant.

We define non-negative functions g(t) and h(t) in (a, b) by the
equations²⁰
$$g(t) = \max\,( |p_{11}(t)|, |p_{12}(t)|, \ldots, |p_{nn}(t)| ), \quad h(t) = \left| \int_{t_0}^{t} g(\tau)\,d\tau \right|.$$

It is easy to verify that g(t), and consequently h(t) as well, is continuous
in (a, b).²¹

Each of the $n^2$ scalar series into which the matrix series (38) splits is
majorized by the series
$$1 + h(t) + \frac{n h^2(t)}{2!} + \frac{n^2 h^3(t)}{3!} + \cdots. \tag{39}$$


20 By definition, the value of g(t) for any value of t is the largest of the $n^2$ moduli of
the values of $p_{ik}(t)$ $(i, k = 1, 2, \ldots, n)$ for that value of t.

21 The continuity of g(t) at any point $t_1$ of the interval (a, b) follows from the fact
that the difference $g(t) - g(t_1)$ for t sufficiently near $t_1$ always coincides with one of
the $n^2$ differences $|p_{ik}(t)| - |p_{ik}(t_1)|$ $(i, k = 1, 2, \ldots, n)$.

For
$$\left| \Bigl( \int_{t_0}^{t} P(\tau)\,d\tau \Bigr)_{ik} \right| = \left| \int_{t_0}^{t} p_{ik}(\tau)\,d\tau \right| \le \left| \int_{t_0}^{t} g(\tau)\,d\tau \right| = h(t),$$
$$\left| \Bigl( \int_{t_0}^{t} P(\tau) \int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau \Bigr)_{ik} \right|
= \left| \sum_{j=1}^{n} \int_{t_0}^{t} p_{ij}(\tau) \int_{t_0}^{\tau} p_{jk}(\sigma)\,d\sigma\,d\tau \right|
\le n \left| \int_{t_0}^{t} g(\tau) \int_{t_0}^{\tau} g(\sigma)\,d\sigma\,d\tau \right| = \frac{n h^2(t)}{2},$$
etc.

The series (39) converges in (a, b) and converges uniformly in every
closed part of this interval. Hence it follows that the matrix series (38) also
converges in (a, b) and does so absolutely and uniformly in every closed
interval contained in (a, b).

By term-by-term differentiation we verify that the sum of (38) is a
solution of (37); this solution becomes E for $t = t_0$. The term-by-term
differentiation of (38) is permissible, because the series obtained after
differentiation differs from (38) by the factor P and therefore, like (38), is
uniformly convergent in every closed interval contained in (a, b).

Thus we have proved the theorem on the existence of a normalized solution
of (37). This solution will be denoted by $\Omega_{t_0}^{t}(P)$ or simply $\Omega_{t_0}^{t}$. Every
other solution, as we have shown in § 1, is of the form
$$X = \Omega_{t_0}^{t}\,C,$$
where C is an arbitrary constant matrix. From this formula it follows that
every solution, in particular the normalized one, is uniquely determined by
its value for $t = t_0$.

This normalized solution $\Omega_{t_0}^{t}$ of (37) is often called the matricant.

We have seen that the matricant can be represented in the form of a
series²²
$$\Omega_{t_0}^{t} = E + \int_{t_0}^{t} P(\tau)\,d\tau + \int_{t_0}^{t} P(\tau) \int_{t_0}^{\tau} P(\sigma)\,d\sigma\,d\tau + \cdots, \tag{40}$$
which converges absolutely and uniformly in every closed interval in which
P(t) is continuous.
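
The successive approximations can be imitated numerically. The sketch below is an illustration only: the coefficient matrix `P(t)`, the grid, the trapezoid rule used for the integrals and the number of iterations are all assumptions made here. Each pass replaces $X_{k-1}$ by $X_k = E + \int_{t_0}^{t} P(\tau) X_{k-1}(\tau)\,d\tau$ on a fixed grid.

```python
import math

N = 600                                  # number of grid intervals (assumption)
t0, t1 = 0.0, 1.5
h = (t1 - t0) / N
ts = [t0 + i * h for i in range(N + 1)]

def P(t):
    # Hypothetical coefficient matrix: P(t) = 2t * [[0, 1], [-1, 0]].
    return [[0.0, 2.0 * t], [-2.0 * t, 0.0]]

def mul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

E = [[1.0, 0.0], [0.0, 1.0]]

def picard_step(X_prev):
    """One successive approximation on the grid, using the cumulative trapezoid rule."""
    F = [mul(P(t), X) for t, X in zip(ts, X_prev)]      # integrand P(tau) X_{k-1}(tau)
    X_new = [[row[:] for row in E]]                      # value at t0 is E
    for i in range(1, N + 1):
        prev = X_new[-1]
        X_new.append([[prev[r][c] + 0.5 * h * (F[i - 1][r][c] + F[i][r][c])
                       for c in range(2)] for r in range(2)])
    return X_new

X = [E for _ in ts]                      # X_0 = E at every grid point
for k in range(30):                      # X_1, X_2, ..., X_30
    X = picard_step(X)

X_end = X[-1]                            # approximation to the matricant at t = t1
print(X_end)
```

For this particular `P(t)` the values at different times commute, and the normalized solution is the rotation with entries $\cos t^2$, $\sin t^2$, $-\sin t^2$, $\cos t^2$, which gives an independent check of the iteration.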


2. We mention a few formulas involving the matricant.


1. $\Omega_{t_0}^{t} = \Omega_{t_1}^{t}\,\Omega_{t_0}^{t_1}$ $\quad (t_0, t_1, t \in (a, b))$.


For since $\Omega_{t_0}^{t}$ and $\Omega_{t_1}^{t}$ are two solutions of (37), we have
$$\Omega_{t_0}^{t} = \Omega_{t_1}^{t}\,C \quad (C \text{ is a constant matrix}).$$
Setting $t = t_1$ in this equation, we obtain $C = \Omega_{t_0}^{t_1}$.


22 The representation of the matricant in the form of such a series was first obtained
by Peano [308].

2. $\Omega_{t_0}^{t}(P + Q) = \Omega_{t_0}^{t}(P)\,\Omega_{t_0}^{t}(S)$ with $S = [\Omega_{t_0}^{t}(P)]^{-1}\,Q\,\Omega_{t_0}^{t}(P)$.
To derive this formula we set:
$$X = \Omega_{t_0}^{t}(P), \quad Y = \Omega_{t_0}^{t}(P + Q),$$
and
$$Y = XZ. \tag{41}$$

Differentiating (41) term by term, we find:
$$(P + Q)\,XZ = PXZ + X \frac{dZ}{dt}.$$
Hence
$$\frac{dZ}{dt} = X^{-1} Q X Z$$
and since it follows from (41) that $Z(t_0) = E$,
$$Z = \Omega_{t_0}^{t}(X^{-1} Q X).$$


When we substitute their respective matricants for X, Y, Z in (41), we
obtain the formula 2.


3. $\ln \left| \Omega_{t_0}^{t}(P) \right| = \int_{t_0}^{t} \operatorname{tr} P\,d\tau$.


This formula follows from the Jacobi identity (4) (p. 114) when we
substitute $\Omega_{t_0}^{t}(P)$ for X(t) in that identity.


4. If $A = \| a_{ik} \|_1^n = \text{const.}$, then
$$\Omega_{t_0}^{t}(A) = e^{A(t - t_0)}.$$


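We introduce the following notation. If $P = \| p_{ik} \|_1^n$, then we shall
mean by mod P the matrix
$$\operatorname{mod} P = \| \, |p_{ik}| \, \|_1^n.$$
Furthermore, if $A = \| a_{ik} \|_1^n$ and $B = \| b_{ik} \|_1^n$ are two real matrices and
$$a_{ik} \le b_{ik} \quad (i, k = 1, 2, \ldots, n),$$
we shall write
$$A \le B.$$
Then it follows from the representation (40) that:


5. If $\operatorname{mod} P(t) \le \operatorname{mod} Q(t)$ $(t \ge t_0)$, then the series (40) for $\Omega_{t_0}^{t}(P)$
is majorized, beginning with the first term, by the same series for $\Omega_{t_0}^{t}(Q)$,
so that for all $t \ge t_0$
$$\operatorname{mod} \Omega_{t_0}^{t}(P) \le \Omega_{t_0}^{t}(Q), \quad \operatorname{mod}\,[\Omega_{t_0}^{t}(P) - E] \le \Omega_{t_0}^{t}(Q) - E,$$
$$\operatorname{mod} \Bigl[ \Omega_{t_0}^{t}(P) - E - \int_{t_0}^{t} P\,d\tau \Bigr] \le \Omega_{t_0}^{t}(Q) - E - \int_{t_0}^{t} Q\,d\tau, \text{ etc.}$$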


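In what follows we shall denote the matrix of order n in which all the
elements are 1 by I:
$$I = \| 1 \|.$$

We consider the function g(t) defined on p. 126. Then we have
$$\operatorname{mod} P(t) \le g(t)\,I.$$

But $\Omega_{t_0}^{t}(g(t) I)$ is the normalized solution of the equation
$$\frac{dX}{dt} = g(t)\,I X.$$
Therefore, by 4.,²³
$$\Omega_{t_0}^{t}(g(t) I) = e^{h(t) I} = E + \Bigl( h(t) + \frac{n h^2(t)}{2!} + \frac{n^2 h^3(t)}{3!} + \cdots \Bigr) I, \tag{42}$$
where
$$h(t) = \int_{t_0}^{t} g(\tau)\,d\tau, \quad g(t) = \max_{1 \le i, k \le n} |p_{ik}(t)|.$$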
Therefore it follows from 5. and (42) that: 


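6. $$\operatorname{mod} \Omega_{t_0}^{t}(P) \le E + \frac{1}{n} \bigl( e^{n h(t)} - 1 \bigr) I,$$
$$\operatorname{mod}\,[\Omega_{t_0}^{t}(P) - E] \le \frac{1}{n} \bigl( e^{n h(t)} - 1 \bigr) I,$$
$$\operatorname{mod} \Bigl[ \Omega_{t_0}^{t}(P) - E - \int_{t_0}^{t} P\,d\tau \Bigr] \le \frac{1}{n} \bigl( e^{n h(t)} - 1 - n h(t) \bigr) I, \text{ etc.}$$


We shall now derive an important formula giving an estimate for the
modulus of the difference between two matricants:

7. $$\operatorname{mod}\,[\Omega_{t_0}^{t}(P) - \Omega_{t_0}^{t}(Q)] \le \frac{1}{n}\, e^{n q (t - t_0)} \bigl( e^{n d (t - t_0)} - 1 \bigr) I \quad (t \ge t_0),$$
if
$$\operatorname{mod} Q \le q I, \quad \operatorname{mod}\,(P - Q) \le d \cdot I, \quad I = \| 1 \|$$
(q, d are non-negative numbers; n is the order of P and Q).
We denote the difference P - Q by D. Then
$$P = Q + D, \quad \operatorname{mod} D \le d \cdot I.$$
Using the expansion (40) of the matricant in a series, we find:
$$\Omega_{t_0}^{t}(Q + D) - \Omega_{t_0}^{t}(Q)
= \int_{t_0}^{t} D(\tau)\,d\tau + \int_{t_0}^{t} D(\tau) \int_{t_0}^{\tau} Q(\sigma)\,d\sigma\,d\tau
+ \int_{t_0}^{t} Q(\tau) \int_{t_0}^{\tau} D(\sigma)\,d\sigma\,d\tau
+ \int_{t_0}^{t} D(\tau) \int_{t_0}^{\tau} D(\sigma)\,d\sigma\,d\tau + \cdots.$$

From this expression it is clear that, for $t \ge t_0$,
$$\begin{aligned}
\operatorname{mod}\,[\Omega_{t_0}^{t}(Q + D) - \Omega_{t_0}^{t}(Q)]
&\le \Omega_{t_0}^{t}(\operatorname{mod} Q + \operatorname{mod} D) - \Omega_{t_0}^{t}(\operatorname{mod} Q) \\
&\le \Omega_{t_0}^{t}((q + d) I) - \Omega_{t_0}^{t}(q I) = e^{(q + d) I (t - t_0)} - e^{q I (t - t_0)} \\
&= e^{q I (t - t_0)} \bigl( e^{d I (t - t_0)} - E \bigr) \\
&= \Bigl[ E + \frac{1}{n} \bigl( e^{n q (t - t_0)} - 1 \bigr) I \Bigr] \cdot \frac{1}{n} \bigl( e^{n d (t - t_0)} - 1 \bigr) I \\
&= \frac{1}{n}\, e^{n q (t - t_0)} \bigl( e^{n d (t - t_0)} - 1 \bigr) I.
\end{aligned}$$


23 By replacing the independent variable t by $h = \int_{t_0}^{t} g(\tau)\,d\tau$.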


We shall now show how to express by means of the matricant the general
solution of a system of linear differential equations with right-hand sides:
$$\frac{dx_i}{dt} = \sum_{k=1}^{n} p_{ik}(t)\,x_k + f_i(t) \quad (i = 1, 2, \ldots, n); \tag{43}$$
$p_{ik}(t)$ and $f_i(t)$ $(i, k = 1, 2, \ldots, n)$ are continuous functions of t in some
interval.
By introducing the column matrices ('vectors') $x = (x_1, x_2, \ldots, x_n)$ and
$f = (f_1, f_2, \ldots, f_n)$ and the square matrix $P = \| p_{ik} \|_1^n$, we write the system
as follows:
$$\frac{dx}{dt} = P x + f(t). \tag{43'}$$
We shall look for a solution of this equation in the form
$$x = \Omega_{t_0}^{t}(P)\,z, \tag{44}$$
where z is an unknown column depending on t. We substitute this expression
for x in (43') and obtain:
$$P\,\Omega_{t_0}^{t}(P)\,z + \Omega_{t_0}^{t}(P)\,\frac{dz}{dt} = P\,\Omega_{t_0}^{t}(P)\,z + f(t);$$
hence
$$\frac{dz}{dt} = [\Omega_{t_0}^{t}(P)]^{-1} f(t).$$

Integrating this, we find:
$$z = \int_{t_0}^{t} [\Omega_{t_0}^{\tau}(P)]^{-1} f(\tau)\,d\tau + c,$$
where c is an arbitrary constant vector. Substituting this expression in
(44), we obtain:
$$x = \Omega_{t_0}^{t}(P) \int_{t_0}^{t} [\Omega_{t_0}^{\tau}(P)]^{-1} f(\tau)\,d\tau + \Omega_{t_0}^{t}(P)\,c. \tag{45}$$

When we give to t the value $t_0$, we find: $x(t_0) = c$. Therefore (45) assumes
the form
$$x = \Omega_{t_0}^{t}(P)\,x(t_0) + \int_{t_0}^{t} K(t, \tau)\,f(\tau)\,d\tau, \tag{45'}$$
where
$$K(t, \tau) = \Omega_{t_0}^{t}(P)\,[\Omega_{t_0}^{\tau}(P)]^{-1}$$
is the so-called Cauchy matrix.
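
For a system with constant coefficients the Cauchy matrix takes a familiar form. The sketch below assumes $P(t) = A = \text{const.}$ and uses property 4. of the matricant; the scalar case in the last line, with $a \neq 0$ and $f \equiv 1$, is an illustrative choice.

```latex
\Omega_{t_0}^{t}(A) = e^{A(t - t_0)}, \qquad
K(t, \tau) = e^{A(t - t_0)} \bigl[ e^{A(\tau - t_0)} \bigr]^{-1} = e^{A(t - \tau)}, \\
x(t) = e^{A(t - t_0)}\, x(t_0) + \int_{t_0}^{t} e^{A(t - \tau)} f(\tau)\, d\tau, \\
n = 1,\ A = a \neq 0,\ f \equiv 1: \qquad
x(t) = e^{a(t - t_0)}\, x(t_0) + \frac{e^{a(t - t_0)} - 1}{a} .
```

In this case $K(t, \tau)$ depends only on the difference $t - \tau$, which need not hold for a variable $P(t)$.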


§ 6. The Multiplicative Integral. The Infinitesimal Calculus 
of Volterra 


1. Let us consider the matricant $\Omega_{t_0}^{t}(P)$. We divide the basic interval
$(t_0, t)$ into n parts by introducing intermediate points $t_1, t_2, \ldots, t_{n-1}$ and
set $\Delta t_k = t_k - t_{k-1}$ $(k = 1, 2, \ldots, n;\ t_n = t)$. Then by property 1. of the
matricant (see the preceding section),
$$\Omega_{t_0}^{t} = \Omega_{t_{n-1}}^{t_n} \cdots \Omega_{t_1}^{t_2}\,\Omega_{t_0}^{t_1}. \tag{46}$$
In the interval $(t_{k-1}, t_k)$ we choose an intermediate point $\tau_k$ $(k = 1, 2, \ldots, n)$.
By regarding the $\Delta t_k$ as small quantities of the first order we can take, for
the computation of $\Omega_{t_{k-1}}^{t_k}$ to within small quantities of the second order,
$P(t) \approx \text{const.} = P(\tau_k)$. Then
$$\Omega_{t_{k-1}}^{t_k} = e^{P(\tau_k) \Delta t_k} + (**) = E + P(\tau_k)\,\Delta t_k + (**); \tag{47}$$
here we denote by the symbol (**) the sum of terms beginning with terms
of the second order.
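From (46) and (47) we find:
$$\Omega_{t_0}^{t} = e^{P(\tau_n) \Delta t_n} \cdots e^{P(\tau_2) \Delta t_2}\,e^{P(\tau_1) \Delta t_1} + (*) \tag{48}$$
and
$$\Omega_{t_0}^{t} = [E + P(\tau_n) \Delta t_n] \cdots [E + P(\tau_2) \Delta t_2]\,[E + P(\tau_1) \Delta t_1] + (*). \tag{49}$$

When we pass to the limit by increasing the number of intervals
indefinitely and letting the length of these intervals tend to zero (the small terms
(*) disappear in the limit),²⁴ we obtain the exact limit formulas
$$\Omega_{t_0}^{t}(P) = \lim_{\Delta t_k \to 0} \bigl[ e^{P(\tau_n) \Delta t_n} \cdots e^{P(\tau_2) \Delta t_2}\,e^{P(\tau_1) \Delta t_1} \bigr] \tag{48'}$$
and
$$\Omega_{t_0}^{t}(P) = \lim_{\Delta t_k \to 0} [E + P(\tau_n) \Delta t_n] \cdots [E + P(\tau_2) \Delta t_2]\,[E + P(\tau_1) \Delta t_1]. \tag{49'}$$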


The expression under the limit sign on the right-hand side of the latter
equation is the product integral.²⁵ We shall call its limit the multiplicative
integral and denote it by the symbol
$$\overset{\frown}{\int_{t_0}^{t}} [E + P(t)\,dt] = \lim_{\Delta t_k \to 0} [E + P(\tau_n) \Delta t_n] \cdots [E + P(\tau_1) \Delta t_1]. \tag{50}$$

The formula (49') gives a representation of the matricant in the form of a
multiplicative integral
$$\Omega_{t_0}^{t}(P) = \overset{\frown}{\int_{t_0}^{t}} (E + P\,dt), \tag{51}$$
and the formulas (48) and (49) may be used for the approximate
computation of the matricant.


24 These arguments can be made more precise by an estimate of the terms we have
denoted by (*). For a rigorous deduction of (48') we have to use formula 7. of § 5 in
which the matrix Q(t) must be replaced by a piece-wise constant matrix
$$Q(t) = P(\tau_k) \quad (t_{k-1} \le t \le t_k;\ k = 1, 2, \ldots, n).$$

25 An analogue to the sum integral for the ordinary integral.

The multiplicative integral was first introduced by Volterra in 1887.
On the basis of this concept Volterra developed an original infinitesimal
calculus for matrix functions (see [63]).²⁶

The whole peculiarity of the multiplicative integral is tied up with the
fact that the various values of the matrix function P(t) in subintervals are
not permutable. In the very special case when all these values are permutable
$$P(t')\,P(t'') = P(t'')\,P(t') \quad (t', t'' \in (t_0, t)),$$
the multiplicative integral, as is clear from (48') and (51), reduces to the
matrix
$$e^{\int_{t_0}^{t} P(t)\,dt}.$$
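
The product of the factors $E + P(\tau_k)\,\Delta t_k$ is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: the family `P(t)`, the choice of the midpoints as intermediate points and the number of subintervals are made here and do not come from the text. The values of this `P(t)` commute with one another, so the result can be compared with the exponential of the ordinary integral.

```python
import math

def mul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def P(t):
    # Hypothetical commuting family: P(t) = t * J with J = [[0, 1], [-1, 0]].
    return [[0.0, t], [-t, 0.0]]

def product_integral(P, t0, t, n):
    """Approximate the multiplicative integral by the finite product
    [E + P(tau_n) dt] ... [E + P(tau_2) dt] [E + P(tau_1) dt];
    factors belonging to later subintervals stand on the left."""
    dt = (t - t0) / n
    result = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, n + 1):
        tau_k = t0 + (k - 0.5) * dt              # midpoint of the k-th subinterval
        Pk = P(tau_k)
        factor = [[1.0 + Pk[0][0] * dt, Pk[0][1] * dt],
                  [Pk[1][0] * dt, 1.0 + Pk[1][1] * dt]]
        result = mul(factor, result)
    return result

approx = product_integral(P, 0.0, 1.0, 20000)

# Exponential of the ordinary integral: the integral of t over (0, 1) is 1/2,
# and exp(theta * J) is the rotation through the angle theta.
theta = 0.5
exact = [[math.cos(theta), math.sin(theta)], [-math.sin(theta), math.cos(theta)]]
print(approx, exact)
```

The finite product is only a first-order approximation, so a fairly large number of subintervals is used; for a family `P(t)` whose values do not commute, the comparison with the exponential of the ordinary integral would in general fail.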


2. We now introduce the multiplicative derivative
$$D_t X = \frac{dX}{dt}\,X^{-1}. \tag{52}$$
The operations $D_t$ and $\overset{\frown}{\int_{t_0}^{t}}$ are mutually inverse:
If
$$D_t X = P,$$
then²⁷
$$X = \overset{\frown}{\int_{t_0}^{t}} (E + P\,dt) \cdot C \quad (C = X(t_0)),$$
and vice versa. The last formula can also be written as follows:²⁸
$$\overset{\frown}{\int_{t_0}^{t}} (E + P\,dt) = X(t)\,X(t_0)^{-1}. \tag{53}$$

We leave it to the reader to verify the following differential and integral
formulas:²⁹


26 The multiplicative integral (in German, Produkt-Integral) was used by Schlesinger
in investigating systems of linear differential equations with analytic coefficients [49]
and [50]; see also [321].

The multiplicative integral (50) exists not only for a function P(t) that is continuous
in the interval of integration, but also under considerably more general conditions
(see [116]).

27 Here the arbitrary constant matrix C is an analogue to the arbitrary additive
constant in the ordinary indefinite integral.

28 An analogue to the formula $\int_{t_0}^{t} P\,dt = X(t) - X(t_0)$, where $\dfrac{dX}{dt} = P$.

29 These formulas can be deduced immediately from the definitions of the
multiplicative derivative and multiplicative integral (see [63]). However, the integral formulas are
obtained more quickly and simply if the multiplicative integral is regarded as a matricant
and the properties of the matricant that were expounded in the preceding section are used
(see [49]).

DIFFERENTIAL FORMULAS


I. $D_t(XY) = D_t(X) + X\,D_t(Y)\,X^{-1}$,
   $D_t(XC) = D_t(X)$,
   $D_t(CY) = C\,D_t(Y)\,C^{-1}$.

II. $D_t(X^T) = X^T (D_t X)^T (X^T)^{-1}$.

III. $D_t(X^{-1}) = -X^{-1} D_t(X)\,X = -(D_t(X^T))^T$,
     $D_t((X^T)^{-1}) = -(D_t(X))^T$.


(C is a constant matrix)
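
The first of the differential formulas can be verified in one line from the definition of the multiplicative derivative as $(dX/dt)\,X^{-1}$. The computation below is a sketch that assumes only that X and Y are differentiable and non-singular.

```latex
D_t(XY) = \frac{d(XY)}{dt}\,(XY)^{-1}
        = \Bigl( \frac{dX}{dt}\,Y + X\,\frac{dY}{dt} \Bigr) Y^{-1} X^{-1}
        = \frac{dX}{dt}\,X^{-1} + X \Bigl( \frac{dY}{dt}\,Y^{-1} \Bigr) X^{-1}
        = D_t(X) + X\, D_t(Y)\, X^{-1} .
```

Since the multiplicative derivative of a constant non-singular matrix C is zero, the two remaining formulas of the first group follow by taking Y = C and X = C, respectively.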


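INTEGRAL FORMULAS

IV. $\overset{\frown}{\int_{t_0}^{t}} (E + P\,d\tau) = \overset{\frown}{\int_{t_1}^{t}} (E + P\,d\tau)\ \overset{\frown}{\int_{t_0}^{t_1}} (E + P\,d\tau)$.

V. $\overset{\frown}{\int_{t_0}^{t}} (E + P\,d\tau) = \Bigl[ \overset{\frown}{\int_{t}^{t_0}} (E + P\,d\tau) \Bigr]^{-1}$.

VI. $\overset{\frown}{\int_{t_0}^{t}} (E + C P C^{-1}\,d\tau) = C\ \overset{\frown}{\int_{t_0}^{t}} (E + P\,d\tau)\ C^{-1}$ (C is a constant matrix).

VII. $\overset{\frown}{\int_{t_0}^{t}} [E + (Q + D_\tau X)\,d\tau] = X(t)\ \overset{\frown}{\int_{t_0}^{t}} (E + X^{-1} Q X\,d\tau)\ X(t_0)^{-1}$.³¹

VIII. $\operatorname{mod} \Bigl[ \overset{\frown}{\int_{t_0}^{t}} (E + P\,d\tau) - \overset{\frown}{\int_{t_0}^{t}} (E + Q\,d\tau) \Bigr] \le \dfrac{1}{n}\, e^{n q (t - t_0)} \bigl( e^{n d (t - t_0)} - 1 \bigr) I \quad (t > t_0)$,

if
$$\operatorname{mod} Q \le q \cdot I, \quad \operatorname{mod}\,(P - Q) \le d \cdot I, \quad I = \| 1 \|$$
(q and d are non-negative numbers; n is the order of P and Q).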
Suppose now that the matrices P and Q depend on the same parameter $\alpha$:
$$P = P(t, \alpha), \quad Q = Q(t, \alpha)$$
and that
$$\lim_{\alpha \to \alpha_0} P(t, \alpha) = \lim_{\alpha \to \alpha_0} Q(t, \alpha) = P_0(t),$$
where the limit is approached uniformly with respect to t in the interval
$(t_0, t)$ in question. Furthermore, let us assume that for $\alpha \to \alpha_0$ the matrix
$Q(t, \alpha)$ is bounded in modulus by $qI$, where q is a positive constant. Then,
setting
$$d(\alpha) = \max_{\substack{1 \le i, k \le n \\ t_0 \le \tau \le t}} | p_{ik}(\tau, \alpha) - q_{ik}(\tau, \alpha) |,$$
we have:
$$\lim_{\alpha \to \alpha_0} d(\alpha) = 0.$$

31 The formula VII can be regarded in a certain sense as an analogue to the formula for
integration by parts in ordinary (non-multiplicative) integrals. VII follows from 2.
of § 5.

Therefore it follows from formula VIII that:
$$\lim_{\alpha \to \alpha_0} \Bigl[ \overset{\frown}{\int_{t_0}^{t}} (E + P\,d\tau) - \overset{\frown}{\int_{t_0}^{t}} (E + Q\,d\tau) \Bigr] = 0.$$
In particular, if Q does not depend on $\alpha$ $(Q(t, \alpha) = P_0(t))$, we obtain:
$$\lim_{\alpha \to \alpha_0} \overset{\frown}{\int_{t_0}^{t}} [E + P(t, \alpha)\,dt] = \overset{\frown}{\int_{t_0}^{t}} [E + P_0(t)\,dt],$$
where
$$P_0(t) = \lim_{\alpha \to \alpha_0} P(t, \alpha).$$

§ 7. Differential Systems in a Complex Domain. General Properties 


1. We consider a system of differential equations
$$\frac{dx_i}{dz} = \sum_{k=1}^{n} p_{ik}(z)\,x_k. \tag{54}$$
Here the given functions $p_{ik}(z)$ and the unknown functions $x_i(z)$
$(i, k = 1, 2, \ldots, n)$ are supposed to be single-valued analytic functions of a complex
argument z, regular in a domain G of the complex z-plane.
Introducing the square matrix $P(z) = \| p_{ik}(z) \|_1^n$ and the column matrix
$x = (x_1, x_2, \ldots, x_n)$, we can write the system (54), as in the case of a real
argument (§ 1), in the form
$$\frac{dx}{dz} = P(z)\,x. \tag{54'}$$

Denoting an integral matrix, i.e., a matrix whose columns are n linearly
independent solutions of (54), by X, we can write instead of (54'):
$$\frac{dX}{dz} = P(z)\,X. \tag{55}$$
Jacobi's formula holds also for a complex argument z:
$$|X| = c\,e^{\int_{z_0}^{z} \operatorname{tr} P\,dz}. \tag{56}$$
Here it is assumed that $z_0$ and all the points of the path along which $\int_{z_0}^{z}$ is
taken are regular points for the single-valued analytic function
$\operatorname{tr} P(z) = p_{11}(z) + p_{22}(z) + \cdots + p_{nn}(z)$.³²


32 Here, and in what follows, the path of integration is taken as a sectionally smooth
curve.

2. A peculiar feature of the case of a complex argument is the fact that for
a single-valued function P(z) the integral matrix X(z) may well be a
many-valued function of z.
As an example, we consider the Cauchy system
$$\frac{dX}{dz} = \frac{U}{z - a}\,X \quad (U \text{ is a constant matrix}). \tag{57}$$

One of the solutions of this system, as in the case of a real argument (see
p. 115), is the integral matrix
$$X = e^{U \ln (z - a)} = (z - a)^U. \tag{58}$$

For the domain G we take the whole z-plane except the point z = a. All the
points of this domain are regular points of the coefficient matrix
$$P(z) = \frac{U}{z - a}.$$
If $U \neq O$, then z = a is a singular point (a pole of the first order) of the
matrix function $P(z) = U/(z - a)$.

An element of the integral matrix (58) after going around the point
z = a once in the positive direction returns with a new value which is obtained
from the old one by multiplication on the right by the constant matrix
$$V = e^{2 \pi i U}.$$
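
The simplest instance, with n = 1, shows the mechanism of this many-valuedness. The value $U = \tfrac{1}{2}$ below is chosen for illustration only.

```latex
X(z) = (z - a)^{1/2} = e^{\frac{1}{2} \ln (z - a)}, \qquad
\ln (z - a) \;\longmapsto\; \ln (z - a) + 2 \pi i
\quad \text{(one positive circuit around } z = a\text{)}, \\
X(z) \;\longmapsto\; e^{\frac{1}{2} ( \ln (z - a) + 2 \pi i )} = X(z)\, e^{\pi i} = - X(z),
\qquad V = e^{2 \pi i \cdot \frac{1}{2}} = -1 .
```

In this example a second circuit restores the original value, since $V^2 = 1$; the coefficient $1/(2(z - a))$ itself is single-valued throughout.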


In the general case of a system (55) we see, by the same reasoning as in
the case of a real argument, that two single-valued solutions $X$ and $\tilde{X}$ are
always connected in some part of the domain G by the formula
$$\tilde{X} = XC,$$
where C is a constant matrix. This formula remains valid under any
analytic continuation of the functions $X(z)$ and $\tilde{X}(z)$ in G.

The proof of the theorem on the existence and (for given initial values) 
uniqueness of the solution of (54) is similar to that of the real case. 

Let us consider a simply-connected star domain³³ $G_1$ (relative to $z_0$)
forming part of G and let the matrix function P(z) be regular³⁴ in $G_1$. We
form the series
$$E + \int_{z_0}^{z} P(\zeta)\,d\zeta + \int_{z_0}^{z} P(\zeta) \int_{z_0}^{\zeta} P(\zeta')\,d\zeta'\,d\zeta + \cdots. \tag{59}$$


33 A domain is called a star domain relative to a point $z_0$ if every segment joining $z_0$
to an arbitrary point z of the domain lies entirely in the given domain.


34 I.e., all the elements $p_{ik}(z)$ $(i, k = 1, 2, \ldots, n)$ of P(z) are regular functions in $G_1$.


Since $G_1$ is simply-connected, it follows that every integral that occurs in
(59) is independent of the path of integration and is a regular function in
$G_1$. Since $G_1$ is a star domain relative to $z_0$, we may assume for the purpose
of an estimate of the moduli of these integrals that they are all taken along
the straight-line segment joining $z_0$ and z.

That the series (59) converges absolutely and uniformly in every closed
part of $G_1$ containing $z_0$ follows from the convergence of the majorant
$$1 + lM + \frac{1}{2!}\,n\,l^2 M^2 + \frac{1}{3!}\,n^2 l^3 M^3 + \cdots.$$

Here M is an upper bound for the modulus of P(z) and $l$ an upper bound
for the distance of z from $z_0$, and both bounds refer to the closed part of
$G_1$ in question.

By differentiating term by term we verify that the sum of the series
(59) is a solution of (55). This solution is normalized, because for $z = z_0$
it reduces to the unit matrix E. The single-valued normalized solution of
(55) will be called, as in the real case, a matricant and will be denoted by
$\Omega_{z_0}^{z}(P)$. Thus we have obtained a representation of the matricant in $G_1$ in
the form of a series³⁵
$$\Omega_{z_0}^{z}(P) = E + \int_{z_0}^{z} P(\zeta)\,d\zeta + \int_{z_0}^{z} P(\zeta) \int_{z_0}^{\zeta} P(\zeta')\,d\zeta'\,d\zeta + \cdots. \tag{60}$$


The properties 1.-4. of the matricant that were set up in § 5 automatically
carry over to the case of a complex argument.


Any solution of (55) that is regular in G and reduces to the matrix $X_0$
for $z = z_0$ can be represented in the form
$$X = \Omega_{z_0}^{z}(P) \cdot C \quad (C = X_0). \tag{61}$$

The formula (61) comprises all single-valued solutions that are regular
in a neighborhood of $z_0$ ($z_0$ is a regular point of the coefficient matrix P(z)).
These solutions when continued analytically in G give all the solutions of
(55); i.e., the equation (55) cannot have any solutions for which $z_0$ would
be a singular point.

For the analytic continuation of the matricant in G it is convenient to
use the multiplicative integral.


35 Our proof for the existence of a normalized solution and its representation in $G_1$
by the series (60) remains valid if instead of the assumption that the domain is a star
domain we make a wider assumption, namely, that for every closed part of $G_1$ there exists
a positive number $l$ such that every point z of this closed part can be joined to $z_0$ by a path
of length not exceeding $l$.

§ 8. The Multiplicative Integral in a Complex Domain


1. The multiplicative integral along a curve in the complex plane is defined
in the following way.

Suppose that L is some path and P(z) a matrix function, continuous on
L. We divide the path L into n parts $(z_0, z_1), (z_1, z_2), \ldots, (z_{n-1}, z_n)$; here
$z_0$ is the beginning, and $z_n = z$ the end of the path, and $z_1, z_2, \ldots, z_{n-1}$ are
intermediate points of division. On the segment $z_{k-1} z_k$ we take an arbitrary
point $\zeta_k$ and we use the notation $\Delta z_k = z_k - z_{k-1}$ $(k = 1, 2, \ldots, n)$. We
then define
$$\overset{\frown}{\int_{L}} [E + P(z)\,dz] = \lim_{\Delta z_k \to 0} [E + P(\zeta_n) \Delta z_n] \cdots [E + P(\zeta_1) \Delta z_1].$$

When we compare this definition with that on p. 132, we see that they
coincide in the special case where $L$ is a segment of the real axis. However,
even in the general case, where $L$ is located anywhere in the complex plane,
the new definition may be reduced to the old one by a change of the variable
of integration.

If

\[
z = z(t)
\]

is a parametric equation of the path, where $z(t)$ is a continuous function
in the interval $(t_0, t)$ with a piece-wise continuous derivative $\frac{dz}{dt}$, then it is
easy to see that

\[
\overset{\frown}{\int}_{L} (E + P(z)\,dz)
  = \overset{\frown}{\int}_{t_0}^{t} \Bigl\{ E + P[z(t)]\,\frac{dz}{dt}\,dt \Bigr\} .
\]

This formula shows that the multiplicative integral along an arbitrary
path exists if the matrix $P(z)$ under the integral sign is continuous along
this path.³⁶
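
The definition above can be tried out numerically. The following is a minimal sketch, not a method from the text: it approximates the multiplicative integral by the finite product in the definition, taking each $\zeta_k$ at the midpoint of its parameter interval. The function names, the step count, and the test matrix are illustrative assumptions.

```python
import cmath

def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product_integral(P, path, t0, t1, steps=2000):
    """Approximate the multiplicative integral of E + P dz along z = path(t).

    Forms the finite product [E + P(zeta_n) dz_n] ... [E + P(zeta_1) dz_1];
    each new factor is multiplied on the left, as in the definition.
    """
    n = len(P(path(t0)))
    result = [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    h = (t1 - t0) / steps
    for k in range(steps):
        dz = path(t0 + (k + 1) * h) - path(t0 + k * h)
        Pk = P(path(t0 + (k + 0.5) * h))      # zeta_k: midpoint in the parameter
        factor = [[(1.0 if i == j else 0.0) + Pk[i][j] * dz for j in range(n)]
                  for i in range(n)]
        result = mat_mul(factor, result)
    return result

# Hypothetical test data: a constant diagonal P, for which the integral
# from z0 to z equals exp(P (z - z0)).
def P_const(z):
    return [[1.0, 0.0], [0.0, 2.0j]]

def segment(t):
    return t * (1 + 1j)                        # straight path from 0 to 1 + i

approx = product_integral(P_const, segment, 0.0, 1.0)
exact_diag = [cmath.exp(1 + 1j), cmath.exp(2j * (1 + 1j))]
```

The finite product converges only to first order in the step size, so the comparison with `exact_diag` should be made with a loose tolerance.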


2. The multiplicative derivative is defined by the previous formula

\[
D_z X = \frac{dX}{dz}\,X^{-1} .
\]

Here it is assumed that $X(z)$ is an analytic function.

All the differential formulas (I-III) of the preceding section carry over
without change to the case of a complex argument. As regards the integral
formulas IV-VI, their outward form has to be modified somewhat:


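36 See footnote 26. Even when $P(z)$ is continuous along $L$, the function $P[z(t)]\,\frac{dz}{dt}$
may only be sectionally continuous. In this case we can split the interval $(t_0, t)$ into
partial intervals in each of which the derivative $\frac{dz}{dt}$ is continuous and can interpret the
integral from $t_0$ to $t$ as the sum of the integrals along these partial intervals.

\[
\text{IV}'.\quad \overset{\frown}{\int}_{(L'+L'')} (E + P\,dz)
  = \overset{\frown}{\int}_{L''} (E + P\,dz)\; \overset{\frown}{\int}_{L'} (E + P\,dz) .
\]

\[
\text{V}'.\quad \overset{\frown}{\int}_{-L} (E + P\,dz)
  = \Bigl[ \overset{\frown}{\int}_{L} (E + P\,dz) \Bigr]^{-1} .
\]

\[
\text{VI}'.\quad \overset{\frown}{\int}_{L} (E + C P C^{-1}\,dz)
  = C\, \overset{\frown}{\int}_{L} (E + P\,dz)\; C^{-1}
  \qquad (C \text{ is a constant matrix}).
\]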

In IV' we have denoted by $L' + L''$ the composite path that is obtained
by traversing first $L'$ and then $L''$. In V', $-L$ denotes the path that differs
from $L$ only in direction.

The formula VII now assumes the form

\[
\text{VII}'.\quad \overset{\frown}{\int}_{L} [E + (Q + D_z X)\,dz]
  = X(z)\, \overset{\frown}{\int}_{L} (E + X^{-1} Q X\,dz)\; X(z_0)^{-1} .
\]

Here $X(z_0)$ and $X(z)$ on the right-hand side denote the values of $X(z)$ at
the beginning and at the end of $L$, respectively.
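Formula VIII is now replaced by the formula

\[
\text{VIII}'.\quad \operatorname{mod} \Bigl[ \overset{\frown}{\int}_{L} (E + P\,dz)
  - \overset{\frown}{\int}_{L} (E + Q\,dz) \Bigr]
  \le \frac{1}{n}\, e^{nql} \bigl( e^{ndl} - 1 \bigr)\, I ,
\]

where $\operatorname{mod} Q \le qI$, $\operatorname{mod}(P - Q) \le d \cdot I$, $I = \| 1 \|$, and $l$ is the length of $L$.
VIII' is easily obtained from VIII if we make a change of variable in the
latter and take as the new variable of integration the arc-length $s$ along $L$
(with $\bigl| \frac{dz}{ds} \bigr| = 1$).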


3. As in the case of a real argument, there exists a close connection between 
the multiplicative integral and the matricant. 

Suppose that $P(z)$ is a single-valued analytic matrix function, regular
in $G$, and that $G_0$ is a simply-connected domain containing $z_0$ and forming
part of $G$. Then the matricant $\Omega_{z_0}^{z}(P)$ is a regular function of $z$ in $G_0$.

We join the points $z_0$ and $z$ by an arbitrary path $L$ lying entirely in $G_0$
and we choose on $L$ intermediate points $z_1, z_2, \ldots, z_{n-1}$. Then, using the
equation

\[
\Omega_{z_0}^{z} = \Omega_{z_{n-1}}^{z} \cdots \Omega_{z_1}^{z_2}\, \Omega_{z_0}^{z_1}
\]

and proceeding to the limit exactly as in § 6 (p. 132), we obtain:

\[
\Omega_{z_0}^{z}(P) = \overset{\frown}{\int}_{L} (E + P\,dz)
  = \overset{\frown}{\int}_{z_0}^{z} (E + P\,dz) . \tag{62}
\]


From this formula it is clear that the multiplicative integral depends not
on the form of the path, but only on the initial point and the end point if
the whole path of integration lies in the simply-connected domain $G_0$ within
which the integrand $P(z)$ is regular. In particular, for a closed contour $L$
in $G_0$, we have:

\[
\overset{\frown}{\oint} (E + P\,dz) = E . \tag{63}
\]

This formula is an analogue of Cauchy's well-known theorem according
to which the ordinary (non-multiplicative) integral along a closed contour
is zero if the contour lies in a simply-connected domain within which the
integrand is regular.


4. The representation of the matricant in the form of the multiplicative
integral (62) can be used for the analytic continuation of the matricant
along an arbitrary path $L$ in $G$. In this case the formula

\[
X = \overset{\frown}{\int}_{z_0}^{z} (E + P\,dz)\; X_0 \tag{64}
\]

gives all those branches of the many-valued integral matrix $X$ of the
differential equation $\frac{dX}{dz} = PX$ that for $z = z_0$ reduce to $X_0$ on one of the branches.
The various branches are obtained by taking account of the various paths
joining $z_0$ and $z$.

By Jacobi's formula (56)

\[
| X | = | X_0 |\; e^{\int_{z_0}^{z} \operatorname{tr} P\,dz}
\]

and, in particular, for $X_0 = E$,

\[
\Bigl| \overset{\frown}{\int}_{z_0}^{z} (E + P\,dz) \Bigr| = e^{\int_{z_0}^{z} \operatorname{tr} P\,dz} . \tag{65}
\]


From this formula it follows that the multiplicative integral is always
a non-singular matrix provided only that the path of integration lies entirely
in a domain in which $P(z)$ is regular.

If $L$ is an arbitrary closed path in $G$ and $G$ is not a simply-connected
domain, then (63) need not hold. Moreover, the value of the integral

\[
\overset{\frown}{\oint} (E + P\,dz)
\]

is not determined by specification of the integrand and the closed path of
integration $L$ alone but also depends on the choice of the initial point of
integration $z_0$ on $L$. For let us take on the closed curve $L$ two points $z_0$ and $z_1$ and
let us denote the portions of the path from $z_0$ to $z_1$ and from $z_1$ to $z_0$ (in the
direction of integration) by $L_1$ and $L_2$, respectively. Then, by the
formula IV',³⁷

\[
\overset{\frown}{\oint}_{z_0} = \overset{\frown}{\int}_{L_2}\, \overset{\frown}{\int}_{L_1} , \qquad
\overset{\frown}{\oint}_{z_1} = \overset{\frown}{\int}_{L_1}\, \overset{\frown}{\int}_{L_2} ,
\]

and therefore

\[
\overset{\frown}{\oint}_{z_1} = \overset{\frown}{\int}_{L_1}\; \overset{\frown}{\oint}_{z_0}\;
  \Bigl( \overset{\frown}{\int}_{L_1} \Bigr)^{-1} . \tag{66}
\]


The formula (66) shows that the symbol $\overset{\frown}{\oint} (E + P\,dz)$ determines a
certain matrix to within a similarity transformation, i.e., determines only the
elementary divisors of that matrix.

We consider an element $X(z)$ of the solution (64) in a neighborhood of
$z_0$. Let $L$ be an arbitrary closed path in $G$ beginning and ending at $z_0$. After
analytic continuation along $L$ the element $X(z)$ goes over into an element
$\tilde{X}(z)$. But the new element $\tilde{X}(z)$ satisfies the same differential equation
(55), since $P(z)$ is a single-valued function in $G$. Therefore

\[
\tilde{X} = X V ,
\]

where $V$ is a non-singular constant matrix. From (64) it follows that

\[
\tilde{X}(z_0) = \overset{\frown}{\oint}_{z_0} (E + P\,dz)\; X_0 .
\]

Comparing this equation with the preceding one, we find:

\[
V = X_0^{-1}\; \overset{\frown}{\oint}_{z_0} (E + P\,dz)\; X_0 . \tag{67}
\]

In particular, for the matricant $X = \Omega_{z_0}^{z}$ we have $X_0 = E$, and then

\[
V = \overset{\frown}{\oint}_{z_0} (E + P\,dz) . \tag{68}
\]

37 To simplify the notation we have omitted the expression to be integrated, $E + P\,dz$,
which is the same for all the integrals.

§ 9. Isolated Singular Points


1. We shall now deal with the behavior of a solution (an integral matrix)
in a neighborhood of an isolated singular point $a$.
Let the matrix function $P(z)$ be regular for the values of $z$ satisfying
the inequality

\[
0 < | z - a | < R .
\]

The set of these values forms a doubly-connected domain $G$. The matrix
function $P(z)$ has in $G$ an expansion in a Laurent series

\[
P(z) = \sum_{n=-\infty}^{+\infty} P_n (z - a)^n . \tag{69}
\]


An element $X(z)$ of the integral matrix, after going once around $a$ in
the positive direction along a path $L$, goes over into an element

\[
X^{+}(z) = X(z)\,V ,
\]

where $V$ is a constant non-singular matrix.
Let $U$ be the constant matrix that is connected with $V$ by the relation

\[
V = e^{2\pi i U} . \tag{70}
\]

Then the matrix function $(z - a)^U$ after going around $a$ along $L$ goes
over into $(z - a)^U V$. Therefore the matrix function

\[
F(z) = X(z)\,(z - a)^{-U} , \tag{71}
\]

which is analytic in $G$, goes over into itself (remains unchanged) by analytic
continuation along $L$.³⁸ Therefore the matrix function $F(z)$ is regular in $G$
and can be expanded in $G$ in a Laurent series

\[
F(z) = \sum_{n=-\infty}^{+\infty} F_n (z - a)^n . \tag{72}
\]

From (71) it follows that:

\[
X(z) = F(z)\,(z - a)^{U} . \tag{73}
\]


Thus every integral matrix $X(z)$ can be represented in the form (73),
where the single-valued function $F(z)$ and the constant matrix $U$ depend on
the coefficient matrix $P(z)$. However, the algorithmic determination of $U$
and of the coefficients $F_n$ in (72) from the coefficients $P_n$ in (69) is, in
general, a complicated task.

38 Hence it follows that when $z$ traverses any other closed path in $G$, the function $F(z)$
returns to its original value.

A special case of the problem, where

\[
P(z) = \sum_{n=-1}^{\infty} P_n (z - a)^n ,
\]

will be analyzed completely in § 10. In this case, the point $a$ is called a
regular singularity of the system (55).
If the expansion (69) has the form

\[
P(z) = \sum_{n=-q}^{\infty} P_n (z - a)^n \qquad (q > 1;\ P_{-q} \ne O) ,
\]

then $a$ is called an irregular singularity of the type of a pole. Finally, if
there is an infinity of non-zero matrix coefficients $P_n$ with negative powers
of $z - a$ in (69), then $a$ is called an essential singularity of the given
differential system.

From (73) it follows that under an arbitrary single circuit in the
positive direction (along some closed path $L$) an integral matrix $X(z)$ is
multiplied on the right by one and the same matrix

\[
V = e^{2\pi i U} .
\]

If this circuit begins (and ends) at $z_0$, then by (67)

\[
V = X(z_0)^{-1}\; \overset{\frown}{\oint}_{z_0} (E + P\,dz)\; X(z_0) . \tag{74}
\]

If instead of $X(z)$ we consider any other integral matrix $\hat{X}(z) = X(z)\,C$
($C$ is a constant matrix; $| C | \ne 0$), then, as is clear from (74), $V$ is replaced
by the similar matrix

\[
\hat{V} = C^{-1} V C .
\]

Thus, the 'integral substitutions' $V$ of the given system form a class of
similar matrices.
From (74) it also follows that the integral

\[
\overset{\frown}{\oint}_{z_0} (E + P\,dz) \tag{75}
\]

is determined by the initial point $z_0$ and does not depend on the form of the
curved path.³⁹ If we change the point $z_0$, then the various values of the
integral that are so obtained are similar.⁴⁰

These properties of the integral (75) can also be confirmed directly. For
let $L$ and $L'$ be two closed paths in $G$ around $z = a$ with the initial points
$z_0$ and $z_0'$ (see Fig. 6).

[Fig. 6]

The doubly-connected domain between $L$ and $L'$ can be made simply-connected
by introducing the cut from $z_0$ to $z_0'$. The integral along the cut will be
denoted by

\[
T = \overset{\frown}{\int}_{z_0}^{z_0'} (E + P\,dz) .
\]

Since the multiplicative integral along a closed contour of a simply-
connected domain is $E$, we have

\[
T^{-1} \Bigl( \overset{\frown}{\oint}_{L'} \Bigr)^{-1} T\; \overset{\frown}{\oint}_{L} = E ;
\]

hence

\[
\overset{\frown}{\oint}_{L} = T^{-1}\; \overset{\frown}{\oint}_{L'}\; T .
\]

Thus, the integral $\overset{\frown}{\oint} (E + P\,dz)$, like $V$, is determined to within similarity,
and we shall occasionally write (74) in the form

\[
V \sim \overset{\frown}{\oint} (E + P\,dz) ,
\]

meaning that the elementary divisors of the matrices on the left-hand and
right-hand sides of the equation coincide.


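2. As an example, we consider a system with a regular singularity

\[
\frac{dX}{dz} = P(z)\,X ,
\]

where

\[
P(z) = \frac{P_{-1}}{z - a} + \sum_{n=0}^{\infty} P_n (z - a)^n .
\]

Let

\[
Q(z) = \frac{P_{-1}}{z - a} .
\]

39 Under the condition, of course, that the path of integration goes around $a$ once in
the positive direction.
40 This follows from (74), or from (66).

Using the formula VIII' of the preceding section, we estimate the modulus
of the difference

\[
D = \overset{\frown}{\oint}_{z_0} (E + P\,dz) - \overset{\frown}{\oint}_{z_0} (E + Q\,dz) , \tag{76}
\]

taking as path of integration a circle of radius $r$ ($r < R$) in the positive
direction. Then with

\[
\operatorname{mod} P_{-1} \le p_{-1} I , \qquad
\operatorname{mod} \sum_{n=0}^{\infty} P_n (z - a)^n \le d(r)\,I \quad \text{for } | z - a | = r , \qquad I = \| 1 \| ,
\]

we set in VIII':

\[
q = \frac{p_{-1}}{r} , \qquad d = d(r) , \qquad l = 2\pi r ,
\]

and then obtain

\[
\operatorname{mod} D \le \frac{1}{n}\, e^{2\pi n p_{-1}} \bigl( e^{2\pi n r\, d(r)} - 1 \bigr)\, I .
\]

Hence it is clear that⁴¹

\[
\lim_{r \to 0} D = O . \tag{77}
\]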
On the other hand, the system

\[
\frac{dY}{dz} = Q\,Y
\]

is a Cauchy system, and in that case we have for an arbitrary choice of the
initial point $z_0$ and for every $r < R$

\[
\overset{\frown}{\oint}_{z_0} (E + Q\,dz) = e^{2\pi i P_{-1}} .
\]

Therefore it follows from (76) and (77) that:

\[
\lim_{r \to 0} \overset{\frown}{\oint}_{z_0} (E + P\,dz) = e^{2\pi i P_{-1}} . \tag{78}
\]

But the elementary divisors of the integral $\overset{\frown}{\oint}_{z_0} (E + P\,dz)$ do not depend on
$z_0$ and $r$ and coincide with those of the integral substitution $V$.

From this Volterra in his well-known memoir (see [374]) and his book
[63] (pp. 117-120) deduces that the matrices $V$ and $e^{2\pi i P_{-1}}$ are similar, so
that the integral substitution $V$ is determined to within similarity by the
'residue' matrix $P_{-1}$.

But this assertion of Volterra is incorrect.

41 Here we have used the fact that for a suitable choice of $d(r)$

\[
\lim_{r \to 0} d(r) = d_0 ,
\]

where $d_0$ is the greatest of the moduli of the elements of $P_0$.

From (74) and (78) we can only deduce that the characteristic values
of the integral substitution $V$ coincide with those of the matrix $e^{2\pi i P_{-1}}$.
However, the elementary divisors of these matrices may be distinct. For example,
for every $r \ne 0$ the matrix

\[
\begin{Vmatrix} \alpha & r \\ 0 & \alpha \end{Vmatrix}
\]

has one elementary divisor $(\lambda - \alpha)^2$, but the limit of the matrix for $r \to 0$,
i.e., the matrix $\begin{Vmatrix} \alpha & 0 \\ 0 & \alpha \end{Vmatrix}$, has two elementary divisors $\lambda - \alpha$, $\lambda - \alpha$.

Thus, Volterra's assertion does not follow from (74) and (78). It is not
even true in general, as the following example shows.

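Let

\[
P(z) = \begin{Vmatrix} 0 & 1 \\ 0 & -\dfrac{1}{z} \end{Vmatrix} .
\]

The corresponding system of differential equations has the form:

\[
\frac{dx_1}{dz} = x_2 , \qquad \frac{dx_2}{dz} = -\frac{x_2}{z} .
\]

Integrating the system we find:

\[
x_1 = c \ln z + d , \qquad x_2 = \frac{c}{z} .
\]

The integral matrix

\[
X(z) = \begin{Vmatrix} \ln z & 1 \\ z^{-1} & 0 \end{Vmatrix} ,
\]

when the singular point $z = 0$ is encircled once in the positive direction, is
multiplied on the right by the matrix

\[
V = \begin{Vmatrix} 1 & 0 \\ 2\pi i & 1 \end{Vmatrix} .
\]

This matrix has one elementary divisor $(\lambda - 1)^2$. At the same time the
matrix

\[
e^{2\pi i P_{-1}} = e^{2\pi i \begin{Vmatrix} 0 & 0 \\ 0 & -1 \end{Vmatrix}}
  = \begin{Vmatrix} 1 & 0 \\ 0 & 1 \end{Vmatrix}
\]

has two elementary divisors $\lambda - 1$, $\lambda - 1$.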
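
The counter-example can be checked numerically. The sketch below is an illustration under our own assumptions (the starting point $z_0 = 1$, the unit circle as path, the step count, and all function names are hypothetical): it approximates the multiplicative integral of $E + P\,dz$ around $z = 0$ by a finite product and then forms $V$ by formula (74).

```python
import cmath
import math

def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def P(z):
    """Coefficient matrix of the example: P(z) = [[0, 1], [0, -1/z]]."""
    return [[0.0, 1.0], [0.0, -1.0 / z]]

def circuit_integral(P, z0, steps=40000):
    """Finite-product approximation of the multiplicative integral of
    E + P dz once around z = 0 in the positive direction, starting at z0."""
    result = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    for k in range(steps):
        z_prev = z0 * cmath.exp(2j * math.pi * k / steps)
        z_next = z0 * cmath.exp(2j * math.pi * (k + 1) / steps)
        zeta = z0 * cmath.exp(2j * math.pi * (k + 0.5) / steps)
        dz = z_next - z_prev
        Pk = P(zeta)
        factor = [[(1.0 if i == j else 0.0) + Pk[i][j] * dz for j in range(2)]
                  for i in range(2)]
        result = mat_mul(factor, result)   # later factors stand on the left
    return result

z0 = 1.0
X0 = [[cmath.log(z0), 1.0], [1.0 / z0, 0.0]]   # X(z0) for X(z) = [[ln z, 1], [1/z, 0]]
X0_inv = [[0.0, 1.0], [1.0, 0.0]]              # inverse of X(1), worked out by hand
V = mat_mul(X0_inv, mat_mul(circuit_integral(P, z0), X0))    # formula (74)
residue_exp = [[1.0, 0.0], [0.0, cmath.exp(-2j * math.pi)]]  # exp(2 pi i P_{-1})
```

Up to discretization error, `V` should agree with the matrix with rows $(1, 0)$ and $(2\pi i, 1)$, whose difference from $E$ is a non-zero nilpotent matrix, while `residue_exp` is the unit matrix to rounding accuracy.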


3. We now consider the case where the matrix $P(z)$ has a finite number
of negative powers of $z - a$ ($a$ is a regular or irregular singularity of the
type of a pole):

\[
P(z) = \frac{P_{-q}}{(z - a)^q} + \cdots + \frac{P_{-1}}{z - a} + \sum_{n=0}^{\infty} P_n (z - a)^n
  \qquad (q \ge 1;\ P_{-q} \ne O) .
\]

We transform the given system

\[
\frac{dX}{dz} = PX \tag{79}
\]

by setting

\[
X = A(z)\,Y , \tag{80}
\]

where $A(z)$ is a matrix function that is regular at $z = a$ and assumes there
the value $E$:

\[
A(z) = E + A_1 (z - a) + A_2 (z - a)^2 + \cdots ;
\]

the power series on the right-hand side converges for $| z - a | < r_1$.

The well-known American mathematician G. D. Birkhoff published
a theorem in 1913 (see [117]) according to which the transformation (80)
can always be chosen such that the coefficient matrix of the transformed
system

\[
\frac{dY}{dz} = P^{*}(z)\,Y \tag{79'}
\]

contains only negative powers of $z - a$:

\[
P^{*}(z) = \frac{P^{*}_{-q}}{(z - a)^q} + \cdots + \frac{P^{*}_{-1}}{z - a} .
\]

Birkhoff's theorem with its complete proof is reproduced in the book
Ordinary Differential Equations, by E. L. Ince.⁴² Moreover, on the basis
of these 'canonical' systems (79') he investigates the behavior of the solution
of an arbitrary system in the neighborhood of a singular point.

Nevertheless, Birkhoff's proof contains an error, and the theorem is not
true. As a counter-example we can take the same example by which we
have above refuted Volterra's claim.⁴³

In this example $q = 1$, $a = 0$ and

\[
P_{-1} = \begin{Vmatrix} 0 & 0 \\ 0 & -1 \end{Vmatrix} , \qquad
P_0 = \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix} , \qquad
P_n = O \quad \text{for } n = 1, 2, \ldots .
\]

Applying Birkhoff's theorem and substituting the product $AY$ for $X$ in (79),
we obtain after replacing $\frac{dY}{dz}$ by $\frac{P^{*}_{-1}}{z}\,Y$ and cancelling $Y$:

\[
A\,\frac{P^{*}_{-1}}{z} + \frac{dA}{dz} = PA .
\]

Equating the coefficients of $1/z$ and of the free terms we find:

\[
P^{*}_{-1} = P_{-1} , \qquad P_{-1} A_1 - A_1 P_{-1} + P_0 = A_1 .
\]

Setting

\[
A_1 = \begin{Vmatrix} a & b \\ c & d \end{Vmatrix} ,
\]

we obtain:

\[
\begin{Vmatrix} 0 & 0 \\ -c & -d \end{Vmatrix}
- \begin{Vmatrix} 0 & -b \\ 0 & -d \end{Vmatrix}
+ \begin{Vmatrix} 0 & 1 \\ 0 & 0 \end{Vmatrix}
= \begin{Vmatrix} a & b \\ c & d \end{Vmatrix} .
\]

This is a contradictory equation: the elements in the first row and second
column give $b + 1 = b$.

In the following section we shall examine, for the case of a regular
singularity, what canonical form the system (79) can be transformed into by
means of a transformation (80).

42 See [20], pp. 632-41. Birkhoff and Ince formulate the theorem for the singular
point $z = \infty$. This is no restriction, because every singular point $z = a$ can be carried by
the transformation $z' = 1/(z - a)$ into $z' = \infty$.

43 In the case $q = 1$ the erroneous statement of Birkhoff coincides in essence with
Volterra's mistake (see p. 145).

§ 10. Regular Singularities 


In studying the behavior of a solution in a neighborhood of a singular point
we can assume without loss of generality that the singular point is $z = 0$.⁴⁴

1. Let the given system be

\[
\frac{dX}{dz} = P(z)\,X , \tag{81}
\]

where

\[
P(z) = \frac{P_{-1}}{z} + \sum_{m=0}^{\infty} P_m z^m \tag{82}
\]

and the series $\sum_{m=0}^{\infty} P_m z^m$ converges in the circle $| z | < r$.
We set

\[
X = A(z)\,Y , \tag{83}
\]

where

\[
A(z) = E + A_1 z + A_2 z^2 + \cdots . \tag{84}
\]

44 By the transformation $z' = z - a$ or $z' = 1/z$ every finite point $z = a$ or $z = \infty$ can
be carried into $z' = 0$.

Leaving aside for the time being the problem of convergence of the series
(84), let us try to determine the matrix coefficients $A_m$ such that the
transformed system

\[
\frac{dY}{dz} = P^{*}(z)\,Y , \tag{85}
\]

where

\[
P^{*}(z) = \frac{P^{*}_{-1}}{z} + \sum_{m=0}^{\infty} P^{*}_m z^m , \tag{86}
\]

is of the simplest possible ('canonical') form.⁴⁵
When we substitute the product $AY$ for $X$ in (81) and use (85), we
obtain:

\[
A(z)\,P^{*}(z)\,Y + \frac{dA}{dz}\,Y = P(z)\,A(z)\,Y .
\]

Multiplying both sides of the equation by $Y^{-1}$ on the right we find:

\[
P(z)\,A(z) - A(z)\,P^{*}(z) = \frac{dA}{dz} .
\]

When we replace here $P(z)$, $A(z)$, and $P^{*}(z)$ by the series (82), (84),
and (86) and equate the coefficients of equal powers of $z$ on the two sides,
we obtain an infinite system of matrix equations for the unknown coefficients
$A_1, A_2, \ldots$:⁴⁶

\[
\begin{aligned}
1.\quad & P_{-1} = P^{*}_{-1} , \\
2.\quad & P_{-1} A_1 - A_1 (P_{-1} + E) + P_0 = P^{*}_0 , \\
3.\quad & P_{-1} A_2 - A_2 (P_{-1} + 2E) + P_0 A_1 - A_1 P^{*}_0 + P_1 = P^{*}_1 , \\
& \cdots \\
(m+2).\quad & P_{-1} A_{m+1} - A_{m+1} [P_{-1} + (m+1) E] \\
& \quad + P_0 A_m - A_m P^{*}_0 + P_1 A_{m-1} - A_{m-1} P^{*}_1 + \cdots + P_m = P^{*}_m , \\
& \cdots
\end{aligned}
\tag{87}
\]


2. We consider several cases separately.

1. The matrix $P_{-1}$ does not have distinct characteristic values that
differ from each other by an integer.

45 We shall aim at having only a finite number (and indeed the smallest possible
number) of non-zero coefficients $P^{*}_m$ in (86).

46 In all the equations beginning with the second we replace $P^{*}_{-1}$ by $P_{-1}$ in accordance
with the first equation.

In this case the matrices $P_{-1}$ and $P_{-1} + kE$ do not have characteristic
values in common for any $k = 1, 2, 3, \ldots$, and therefore (see Vol. I, Chapter
VIII, § 3)⁴⁷ the matrix equation

\[
P_{-1} U - U (P_{-1} + kE) = T
\]

has one and only one solution for an arbitrary right-hand side $T$.
We shall denote this solution by

\[
\Phi_k (P_{-1}, T) .
\]

We can therefore set all the matrices $P^{*}_m$ ($m = 0, 1, 2, \ldots$) in (87) equal to
zero and determine $A_1, A_2, \ldots$ successively by means of the equations

\[
A_1 = \Phi_1 (P_{-1}, -P_0) , \qquad A_2 = \Phi_2 (P_{-1}, -P_1 - P_0 A_1) , \quad \ldots .
\]

The transformed system is then a Cauchy system

\[
\frac{dY}{dz} = \frac{P_{-1}}{z}\,Y ,
\]

and so the solution $X$ of the original system (81) is of the form⁴⁸

\[
X = A(z)\,z^{P_{-1}} . \tag{88}
\]
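
The successive determination of $A_1, A_2, \ldots$ can be sketched in code. The following is a minimal illustration under an additional assumption that is ours, not the text's: $P_{-1}$ is taken to be diagonal, so that the equation $P_{-1}U - U(P_{-1} + kE) = T$ is solved element by element as $u_{ik} = t_{ik}/(\lambda_i - \lambda_k - k)$. The function and variable names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    """Element-wise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transformation_coefficients(lams, P, count):
    """Return [A_0, A_1, ..., A_count] with A_0 = E for the system
    dX/dz = (diag(lams)/z + sum_m P[m] z^m) X.

    Assumes no difference lams[i] - lams[k] equals 1, 2, ..., count, so that
    every P*_m may be set to zero (case 1 of the text, diagonal P_{-1}).
    """
    n = len(lams)
    A = [[[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]]
    for m in range(1, count + 1):
        # T = P_0 A_{m-1} + P_1 A_{m-2} + ... + P_{m-1} A_0 (missing P_j are zero)
        T = [[0.0] * n for _ in range(n)]
        for j in range(min(m, len(P))):
            T = mat_add(T, mat_mul(P[j], A[m - 1 - j]))
        # A_m = Phi_m(P_{-1}, -T), solved element-wise for diagonal P_{-1}
        A.append([[-T[i][k] / (lams[i] - lams[k] - m) for k in range(n)]
                  for i in range(n)])
    return A

# Hypothetical data: a scalar system dX/dz = (0.3/z + 2) X, where A(z) = exp(2z).
scalar = transformation_coefficients([0.3], [[[2.0]]], 3)
# Hypothetical 2x2 data with characteristic values 0.5 and 0.
two_by_two = transformation_coefficients([0.5, 0.0], [[[0.0, 1.0], [0.0, 0.0]]], 2)
```

In the scalar case the coefficients should reproduce the Taylor coefficients $2^m/m!$ of $e^{2z}$, which gives a quick sanity check of the recursion.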


2. Among the distinct characteristic values of $P_{-1}$ there are some whose
difference is an integer; furthermore, the matrix $P_{-1}$ is of simple structure.

We denote the characteristic values of $P_{-1}$ by $\lambda_1, \lambda_2, \ldots, \lambda_n$ and order
them in such a way that the inequalities

\[
\operatorname{Re}(\lambda_1) \ge \operatorname{Re}(\lambda_2) \ge \cdots \ge \operatorname{Re}(\lambda_n) \tag{89}
\]

hold.

47 However, we can also prove this without referring to Chapter VIII. The
proposition in which we are interested is equivalent to the statement that the matrix equation

\[
P_{-1} U = U (P_{-1} + kE) \tag{*}
\]

has only the solution $U = O$. Since the matrices $P_{-1}$ and $P_{-1} + kE$ have no characteristic
values in common, there exists a polynomial $f(\lambda)$ for which

\[
f(P_{-1}) = O , \qquad f(P_{-1} + kE) = E .
\]

But from (*) it follows that

\[
f(P_{-1})\,U = U f(P_{-1} + kE) .
\]

Hence $U = O$.

48 The formula (88) defines one integral matrix of the system (81). Every integral
matrix is obtained from (88) by multiplication on the right by an arbitrary constant
non-singular matrix $C$.

Without loss of generality we can replace $P_{-1}$ by a similar matrix. This
follows from the fact that when we multiply both sides of (81) on the left
by a non-singular matrix $T$ and on the right by $T^{-1}$, we in fact replace all
the $P_m$ by $T P_m T^{-1}$ ($m = -1, 0, 1, 2, \ldots$); moreover, $X$ is replaced by
$T X T^{-1}$. Therefore we may assume in this case that $P_{-1}$ is a diagonal
matrix:

\[
P_{-1} = \| \lambda_i \delta_{ik} \|_1^n . \tag{90}
\]

We introduce a notation for the elements of $P_m$, $P^{*}_m$, and $A_m$:

\[
P_m = \| p_{ik}^{(m)} \|_1^n , \qquad P^{*}_m = \| p_{ik}^{(m)*} \|_1^n , \qquad A_m = \| x_{ik}^{(m)} \|_1^n . \tag{91}
\]


In order to determine $A_1$, we use the second equation in (87). This
matrix equation can be replaced by the scalar equations

\[
(\lambda_i - \lambda_k - 1)\, x_{ik}^{(1)} + p_{ik}^{(0)} = p_{ik}^{(0)*} \qquad (i, k = 1, 2, \ldots, n) . \tag{92}
\]

If none of the differences $\lambda_i - \lambda_k$ is 1, we can set $P^{*}_0 = O$. We then have
from the second equation of (87) that $A_1 = \Phi_1 (P_{-1}, -P_0)$.⁴⁹
In that case the elements of $A_1$ are uniquely determined from (92):

\[
x_{ik}^{(1)} = - \frac{p_{ik}^{(0)}}{\lambda_i - \lambda_k - 1} \qquad (i, k = 1, 2, \ldots, n) . \tag{93}
\]

But if for some⁵⁰ $i$, $k$

\[
\lambda_i - \lambda_k = 1 ,
\]

then the corresponding $p_{ik}^{(0)*}$ is determined from (92):

\[
p_{ik}^{(0)*} = p_{ik}^{(0)} ,
\]

and the corresponding $x_{ik}^{(1)}$ can be chosen quite arbitrarily.
For those $i$ and $k$ for which $\lambda_i - \lambda_k \ne 1$ we set:

\[
p_{ik}^{(0)*} = 0 ,
\]

and find the corresponding $x_{ik}^{(1)}$ from (93).
Having determined $A_1$, we next determine $A_2$ from the third equation
of (87). We replace this matrix equation by a system of $n^2$ scalar equations:

\[
(\lambda_i - \lambda_k - 2)\, x_{ik}^{(2)} = p_{ik}^{(1)*} - p_{ik}^{(1)} - (P_0 A_1 - A_1 P^{*}_0)_{ik}
  \qquad (i, k = 1, 2, \ldots, n) . \tag{94}
\]

Here we proceed exactly as in the determination of $A_1$.

49 We use the notation introduced in dealing with the case 1.
50 By (89) this is only possible for $i < k$.

If $\lambda_i - \lambda_k \ne 2$, then we set:

\[
p_{ik}^{(1)*} = 0 ,
\]

and find from (94):

\[
x_{ik}^{(2)} = - \frac{1}{\lambda_i - \lambda_k - 2} \bigl[ p_{ik}^{(1)} + (P_0 A_1 - A_1 P^{*}_0)_{ik} \bigr] .
\]

But if $\lambda_i - \lambda_k = 2$, then it follows from (94) for these $i$ and $k$ that:

\[
p_{ik}^{(1)*} = p_{ik}^{(1)} + (P_0 A_1 - A_1 P^{*}_0)_{ik} .
\]

In this case $x_{ik}^{(2)}$ is chosen arbitrarily.

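Continuing this process we determine all the matrices $P^{*}_{-1}, P^{*}_0, P^{*}_1, \ldots$
and $A_1, A_2, \ldots$ in succession.

Furthermore, only a finite number of the matrices $P^{*}_m$ is different from
zero and, as is easy to see, $P^{*}(z)$ is of the form⁵¹

\[
P^{*}(z) =
\begin{Vmatrix}
\dfrac{\lambda_1}{z} & a_{12}\, z^{\lambda_1 - \lambda_2 - 1} & \cdots & a_{1n}\, z^{\lambda_1 - \lambda_n - 1} \\
0 & \dfrac{\lambda_2}{z} & \cdots & a_{2n}\, z^{\lambda_2 - \lambda_n - 1} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \dfrac{\lambda_n}{z}
\end{Vmatrix} , \tag{95}
\]

where $a_{ik} = 0$ when $\lambda_i - \lambda_k$ is not a positive integer, and $a_{ik} = p_{ik}^{(\lambda_i - \lambda_k - 1)*}$
when $\lambda_i - \lambda_k$ is a positive integer.
We denote by $m_i$ the integral part of the numbers $\operatorname{Re} \lambda_i$:⁵²

\[
m_i = [\operatorname{Re}(\lambda_i)] \qquad (i = 1, 2, \ldots, n) . \tag{96}
\]

Then, by (89),

\[
m_1 \ge m_2 \ge \cdots \ge m_n .
\]

If $\lambda_i - \lambda_k$ is an integer, then

\[
\lambda_i - \lambda_k = m_i - m_k .
\]

51 $P^{*}_m$ ($m \ge 0$) can be different from zero only when there exist characteristic values $\lambda_i$
and $\lambda_k$ of $P_{-1}$ such that $\lambda_i - \lambda_k - 1 = m$ (and, by (89), $i < k$). For a given $m$ there
corresponds to each such equation an element $p_{ik}^{(m)*} = a_{ik}$ of the matrix $P^{*}_m$; this element
may be different from zero. All the remaining elements of $P^{*}_m$ are zero.

52 I.e., $m_i$ is the largest integer not exceeding $\operatorname{Re} \lambda_i$ ($i = 1, 2, \ldots, n$).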


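Therefore in the expression (95) for the canonical matrix $P^{*}(z)$ we can
replace all the differences $\lambda_i - \lambda_k$ by $m_i - m_k$. Furthermore, we set:

\[
\tilde{\lambda}_i = \lambda_i - m_i \qquad (i = 1, 2, \ldots, n) , \tag{96'}
\]

\[
M = \| m_i \delta_{ik} \|_1^n , \qquad
U = \begin{Vmatrix}
\tilde{\lambda}_1 & a_{12} & \cdots & a_{1n} \\
0 & \tilde{\lambda}_2 & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \tilde{\lambda}_n
\end{Vmatrix} . \tag{97}
\]

Then it follows from (95) (see formula I on p. 134):

\[
P^{*}(z) = z^{M}\, \frac{U}{z}\, z^{-M} + \frac{M}{z} = D_z ( z^{M} z^{U} ) .
\]

Hence $Y = z^{M} z^{U}$ is a solution of (85) and

\[
X = A(z)\, z^{M} z^{U} \tag{98}
\]

is a solution of (81).⁵³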
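
As a check on (95)-(98), a minimal worked case may help. The choice $n = 2$, $\lambda_1 = 1$, $\lambda_2 = 0$ and the name $a$ for the single free constant $a_{12}$ are illustrative assumptions, not data from the text.

```latex
\[
m_1 = 1,\quad m_2 = 0,\quad \tilde{\lambda}_1 = \tilde{\lambda}_2 = 0, \qquad
M = \begin{Vmatrix} 1 & 0 \\ 0 & 0 \end{Vmatrix}, \quad
U = \begin{Vmatrix} 0 & a \\ 0 & 0 \end{Vmatrix}, \quad
P^{*}(z) = \begin{Vmatrix} 1/z & a \\ 0 & 0 \end{Vmatrix}.
\]
Since $U^{2} = O$, we have $z^{U} = E + U \ln z$, and therefore
\[
Y = z^{M} z^{U}
  = \begin{Vmatrix} z & 0 \\ 0 & 1 \end{Vmatrix}
    \begin{Vmatrix} 1 & a \ln z \\ 0 & 1 \end{Vmatrix}
  = \begin{Vmatrix} z & a z \ln z \\ 0 & 1 \end{Vmatrix},
\qquad
\frac{dY}{dz} = \begin{Vmatrix} 1 & a (\ln z + 1) \\ 0 & 0 \end{Vmatrix}
  = P^{*}(z)\, Y .
\]
```

The logarithmic term appears exactly because $\lambda_1 - \lambda_2$ is a positive integer and $a$ cannot in general be made to vanish.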

3. The general case. As we have explained above, we may replace $P_{-1}$
without loss of generality by an arbitrary similar matrix. We shall assume
that $P_{-1}$ has the Jordan normal form⁵⁴

\[
P_{-1} = \{ \lambda_1 E_1 + H_1,\ \lambda_2 E_2 + H_2,\ \ldots,\ \lambda_u E_u + H_u \} , \tag{99}
\]

with

\[
\operatorname{Re}(\lambda_1) \ge \operatorname{Re}(\lambda_2) \ge \cdots \ge \operatorname{Re}(\lambda_u) . \tag{100}
\]

Here $E$ denotes the unit matrix and $H$ the matrix in which the elements of
the first superdiagonal are 1 and all the remaining elements zero. The orders
of the matrices $E_i$ and $H_i$ in distinct diagonal blocks are, in general,
different; their orders coincide with the degrees of the corresponding elementary
divisors of $P_{-1}$.⁵⁵

53 The special form of the matrices (97) corresponds to the canonical form of $P_{-1}$.
If $P_{-1}$ does not have the canonical form, then the matrices $M$ and $U$ in (98) are similar to
the matrices (97).

54 See Vol. I, Chapter VI, § 6.

55 To simplify the notation, the index that indicates the order of the matrices is omitted
from $E$ and $H$.

In accordance with the representation (99) of $P_{-1}$ we split all the
matrices $P_m$, $P^{*}_m$, $A_m$ into blocks:

\[
P_m = ( P_{ik}^{(m)} )_1^u , \qquad P^{*}_m = ( P_{ik}^{(m)*} )_1^u , \qquad A_m = ( X_{ik}^{(m)} )_1^u .
\]

Then the second of the equations (87) may be replaced by a system of
equations

\[
(\lambda_i E_i + H_i)\, X_{ik}^{(1)} - X_{ik}^{(1)} [ (\lambda_k + 1) E_k + H_k ] + P_{ik}^{(0)} = P_{ik}^{(0)*} , \tag{101}
\]

which can also be written as follows:

\[
(\lambda_i - \lambda_k - 1)\, X_{ik}^{(1)} + H_i X_{ik}^{(1)} - X_{ik}^{(1)} H_k + P_{ik}^{(0)} = P_{ik}^{(0)*}
  \qquad (i, k = 1, 2, \ldots, u) . \tag{102}
\]
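Suppose that⁵⁶

\[
X_{ik}^{(1)} = \| x_{st} \| , \qquad P_{ik}^{(0)} = \| p_{st}^{(0)} \| , \qquad P_{ik}^{(0)*} = \| p_{st}^{(0)*} \| .
\]

Then the matrix equation (102) (for fixed $i$ and $k$) can be replaced by a
system of scalar equations of the form⁵⁷

\[
(\lambda_i - \lambda_k - 1)\, x_{st} + x_{s+1,t} - x_{s,t-1} + p_{st}^{(0)} = p_{st}^{(0)*} \tag{103}
\]
\[
(s = 1, 2, \ldots, v;\ t = 1, 2, \ldots, w;\ x_{v+1,t} = x_{s,0} = 0) ,
\]

where $v$ and $w$ are the orders of the matrices $\lambda_i E_i + H_i$ and $\lambda_k E_k + H_k$ in (99).

If $\lambda_i - \lambda_k \ne 1$, then in (103) we can set all the $p_{st}^{(0)*}$ equal to zero and
determine all the $x_{st}$ uniquely from the recurrence relations (103). This
means that in the matrix equations (102) we set

\[
P_{ik}^{(0)*} = O
\]

and determine $X_{ik}^{(1)}$ uniquely.
If $\lambda_i - \lambda_k = 1$, then the relations (103) assume the form

\[
x_{s+1,t} - x_{s,t-1} + p_{st}^{(0)} = p_{st}^{(0)*} \tag{104}
\]
\[
(x_{v+1,t} = x_{s,0} = 0;\ s = 1, 2, \ldots, v;\ t = 1, 2, \ldots, w) .
\]

56 To simplify the notation, we omit the indices $i$, $k$ in the elements of the matrices
$X_{ik}^{(1)}$, $P_{ik}^{(0)}$, $P_{ik}^{(0)*}$.

57 The reader should bear in mind the properties of the matrix $H$ that were developed
on pp. 13-15 of Vol. I.

It is not difficult to show that the elements $x_{st}$ of $X_{ik}^{(1)}$ can be determined
from (104) so that the matrix $P_{ik}^{(0)*}$ has, depending on its dimensions ($v \times w$),
one of the forms

\[
\begin{Vmatrix}
a_0 & 0 & \cdots & 0 \\
a_1 & a_0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
a_{v-1} & a_{v-2} & \cdots & a_0
\end{Vmatrix}
\ (v = w) , \qquad
\begin{Vmatrix}
a_0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
a_1 & a_0 & \cdots & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
a_{v-1} & a_{v-2} & \cdots & a_0 & 0 & \cdots & 0
\end{Vmatrix}
\ (v < w) ,
\]
\[
\begin{Vmatrix}
0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 \\
a_0 & 0 & \cdots & 0 \\
a_1 & a_0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
a_{w-1} & a_{w-2} & \cdots & a_0
\end{Vmatrix}
\ (v > w) . \tag{105}
\]

We shall say of the matrices (105) that they have the regular lower
triangular form.⁵⁸

From the third of the equations (87) we can determine $A_2$. This
equation can be replaced by the system

\[
(\lambda_i - \lambda_k - 2)\, X_{ik}^{(2)} + H_i X_{ik}^{(2)} - X_{ik}^{(2)} H_k
  + ( P_0 A_1 - A_1 P^{*}_0 )_{ik} + P_{ik}^{(1)} = P_{ik}^{(1)*}
  \qquad (i, k = 1, 2, \ldots, u) . \tag{106}
\]

In the same way that we determined $A_1$, we determine $X_{ik}^{(2)}$ uniquely with
$P_{ik}^{(1)*} = O$ from (106) provided $\lambda_i - \lambda_k \ne 2$. But if $\lambda_i - \lambda_k = 2$, then $X_{ik}^{(2)}$ can
be determined so that $P_{ik}^{(1)*}$ is of regular lower triangular form.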

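58 Regular upper triangular matrices are defined similarly. The elements of $X_{ik}^{(1)}$ are
not all uniquely determined from (104); there is a certain degree of arbitrariness in the
choice of the elements $x_{st}$. This is immediately clear from (102): for $\lambda_i - \lambda_k = 1$ we
may add to $X_{ik}^{(1)}$ an arbitrary matrix permutable with $H$, i.e., an arbitrary regular upper
triangular matrix.

Continuing this process, we determine all the coefficient matrices $A_1,
A_2, \ldots$ and $P^{*}_{-1}, P^{*}_0, P^{*}_1, \ldots$ in succession. Only a finite number of the
coefficients $P^{*}_m$ is different from zero, and the matrix $P^{*}(z)$ has the following
block form:⁵⁹

\[
P^{*}(z) =
\begin{Vmatrix}
\dfrac{\lambda_1 E_1 + H_1}{z} & B_{12}\, z^{\lambda_1 - \lambda_2 - 1} & \cdots & B_{1u}\, z^{\lambda_1 - \lambda_u - 1} \\
O & \dfrac{\lambda_2 E_2 + H_2}{z} & \cdots & B_{2u}\, z^{\lambda_2 - \lambda_u - 1} \\
\vdots & \vdots & \ddots & \vdots \\
O & O & \cdots & \dfrac{\lambda_u E_u + H_u}{z}
\end{Vmatrix} , \tag{107}
\]

where

\[
B_{ik} =
\begin{cases}
O & \text{if } \lambda_i - \lambda_k \text{ is not a positive integer,} \\
P_{ik}^{(\lambda_i - \lambda_k - 1)*} & \text{if } \lambda_i - \lambda_k \text{ is a positive integer.}
\end{cases}
\]

All the matrices $B_{ik}$ ($i, k = 1, 2, \ldots, u$; $i < k$) are of regular lower
triangular form.
As in the preceding case, we denote by $m_i$ the integral part of $\operatorname{Re} \lambda_i$

\[
m_i = [\operatorname{Re}(\lambda_i)] \qquad (i = 1, 2, \ldots, u) \tag{108}
\]

and we set

\[
\lambda_i = m_i + \tilde{\lambda}_i \qquad (i = 1, 2, \ldots, u) . \tag{108'}
\]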


Then in the expression (107) for $P^{*}(z)$ we may again replace the difference
$\lambda_i - \lambda_k$ everywhere by $m_i - m_k$. If we introduce the diagonal matrix $M$
with integer elements and the upper triangular matrix $U$ by means of the
equations⁶⁰

\[
M = ( m_i E_i \delta_{ik} )_1^u , \qquad
U = \begin{Vmatrix}
\tilde{\lambda}_1 E_1 + H_1 & B_{12} & \cdots & B_{1u} \\
O & \tilde{\lambda}_2 E_2 + H_2 & \cdots & B_{2u} \\
\vdots & \vdots & \ddots & \vdots \\
O & O & \cdots & \tilde{\lambda}_u E_u + H_u
\end{Vmatrix} , \tag{109}
\]

then we easily obtain, starting from (107), the following representation of
$P^{*}(z)$:

\[
P^{*}(z) = z^{M}\, \frac{U}{z}\, z^{-M} + \frac{M}{z} = D_z ( z^{M} z^{U} ) .
\]

59 The dimensions of the square matrices $E_i$, $H_i$ and the rectangular matrices $B_{ik}$ are
determined by the dimensions of the diagonal blocks in the Jordan matrix $P_{-1}$, i.e., by
the degrees of the elementary divisors of $P_{-1}$.

60 Here the splitting into blocks corresponds to that of $P_{-1}$ and $P^{*}(z)$.

Hence it follows that the solution of (85) can be given in the form

\[
Y = z^{M} z^{U} ,
\]

and the solution of (81) can be represented as follows:

\[
X = A(z)\, z^{M} z^{U} . \tag{110}
\]

Here $A(z)$ is the matrix series (84), $M$ is a constant diagonal matrix whose
elements are integers, and $U$ is a constant triangular matrix. The matrices
$M$ and $U$ are defined by (108), (108'), and (109).⁶¹

3. We now proceed to prove the convergence of the series

\[
A(z) = E + A_1 z + A_2 z^2 + \cdots .
\]

We shall use a lemma which is of independent interest.

Lemma: If the series⁶²

\[
x = a_0 + a_1 z + a_2 z^2 + \cdots \tag{111}
\]

formally satisfies the system

\[
\frac{dx}{dz} = P(z)\,x \tag{112}
\]

for which $z = 0$ is a regular singularity, then (111) converges in every
neighborhood of $z = 0$ in which the expansion of the coefficient matrix $P(z)$ in
the series (82) converges.

Proof. Let us suppose that

\[
P(z) = \frac{P_{-1}}{z} + \sum_{q=0}^{\infty} P_q z^q ,
\]

where the series $\sum_{m=0}^{\infty} P_m z^m$ converges for $| z | < r$. Then there exist
positive constants $p_{-1}$ and $p$ such that⁶³

\[
\operatorname{mod} P_{-1} \le p_{-1} I , \qquad \operatorname{mod} P_m \le \frac{p}{r^m}\, I , \qquad I = \| 1 \|
  \qquad (m = 0, 1, 2, \ldots) . \tag{113}
\]

Substituting the series (111) for $x$ in (112) and comparing the
coefficients of like powers on both sides of (112), we obtain an infinite system of
(column) vector equations

\[
\begin{aligned}
P_{-1} a_0 &= 0 , \\
(E - P_{-1})\, a_1 &= P_0 a_0 , \\
(2E - P_{-1})\, a_2 &= P_0 a_1 + P_1 a_0 , \\
& \cdots \\
(mE - P_{-1})\, a_m &= P_0 a_{m-1} + P_1 a_{m-2} + \cdots + P_{m-1} a_0 , \\
& \cdots
\end{aligned}
\tag{114}
\]

61 See footnote 53.

62 Here $x = (x_1, x_2, \ldots, x_n)$ is a column of unknown functions; $a_0, a_1, a_2, \ldots$ are
constant columns; $P(z)$ is a square coefficient matrix.

63 For the definition of the modulus of a matrix, see p. 128.
It is sufficient to prove that every remainder of the series (111)

\[
x^{(k)} = a_k z^k + a_{k+1} z^{k+1} + \cdots \tag{115}
\]

converges in a neighborhood of $z = 0$. The number $k$ is subject to the
inequality

\[
k > n p_{-1} .
\]

Then $k$ exceeds the moduli of all the characteristic values of $P_{-1}$,⁶⁴ so that
for $m \ge k$ we have $| mE - P_{-1} | \ne 0$ and

\[
(mE - P_{-1})^{-1} = \frac{1}{m} \Bigl( E - \frac{1}{m} P_{-1} \Bigr)^{-1}
  = \frac{1}{m} E + \frac{1}{m^2} P_{-1} + \frac{1}{m^3} P_{-1}^2 + \cdots . \tag{116}
\]

In the last part of this equation there is a convergent matrix series. With
the help of this series and by using (114), we can express all the coefficients
of (115) in terms of $a_0, a_1, \ldots, a_{k-1}$ by means of the recurrence relations

\[
a_m = \Bigl( \frac{1}{m} E + \frac{1}{m^2} P_{-1} + \frac{1}{m^3} P_{-1}^2 + \cdots \Bigr)
  ( f_{m-1} + P_0 a_{m-1} + \cdots + P_{m-k-1} a_k )
  \qquad (m = k, k+1, \ldots) , \tag{117}
\]

where

\[
f_{m-1} = P_{m-k}\, a_{k-1} + \cdots + P_{m-1}\, a_0 \qquad (m = k, k+1, \ldots) . \tag{118}
\]

64 If $\lambda_0$ is a characteristic value of $A = \| a_{ik} \|_1^n$, then $| \lambda_0 | \le n \max_{1 \le i,k \le n} | a_{ik} |$. For let
$Ax = \lambda_0 x$, where $x = (x_1, x_2, \ldots, x_n) \ne 0$. Then

\[
\lambda_0 x_i = \sum_{k=1}^{n} a_{ik} x_k \qquad (i = 1, 2, \ldots, n) .
\]

Let $| x_j | = \max \{ | x_1 |, | x_2 |, \ldots, | x_n | \}$. Then

\[
| \lambda_0 |\, | x_j | \le \sum_{k=1}^{n} | a_{jk} |\, | x_k | \le | x_j |\; n \max_{1 \le i,k \le n} | a_{ik} | .
\]

Dividing through by $| x_j |$, we obtain the required inequality.

Note that the series (115) formally satisfies the differential equation

\[
\frac{dx^{(k)}}{dz} = P(z)\, x^{(k)} + f(z) , \tag{119}
\]

where

\[
f(z) = \sum_{m=k-1}^{\infty} f_m z^m
  = P(z)\, ( a_0 + a_1 z + \cdots + a_{k-1} z^{k-1} )
  - a_1 - 2 a_2 z - \cdots - (k-1)\, a_{k-1} z^{k-2} . \tag{120}
\]
From (120) it follows that the series

\[
\sum_{m=k-1}^{\infty} f_m z^m
\]

converges for $| z | < r$; hence there exists an integer $N > 0$ such that⁶⁵

\[
\operatorname{mod} f_m \le \Bigl\| \frac{N}{r^m} \Bigr\| \qquad (m = k-1, k, \ldots) . \tag{121}
\]

From the form of the recurrence relations (117) it follows that when the
matrices $P_{-1}$, $P_q$, $f_{m-1}$ in these relations are replaced by the majorant
matrices $p_{-1} I$, $\frac{p}{r^q} I$, $\bigl\| \frac{N}{r^{m-1}} \bigr\|$ and the column $a_m$ by $\| \alpha_m \|$ ($m = k, k+1,
\ldots$; $q = 0, 1, 2, \ldots$),⁶⁶ then we obtain relations that determine upper bounds
$\| \alpha_m \|$ for $\operatorname{mod} a_m$:

\[
\operatorname{mod} a_m \le \| \alpha_m \| . \tag{122}
\]

Therefore the series

\[
\xi^{(k)} = \alpha_k z^k + \alpha_{k+1} z^{k+1} + \cdots , \tag{123}
\]

after term-by-term multiplication with the column $\| 1 \|$, becomes a majorant
series for (115).
By replacing in (119) the matrix coefficients $P_{-1}$, $P_q$, $f_m$ of the series

\[
P(z) = \frac{P_{-1}}{z} + \sum_{q=0}^{\infty} P_q z^q , \qquad f(z) = \sum_{m=k-1}^{\infty} f_m z^m
\]

by the corresponding majorant matrices $p_{-1} I$, $\frac{p}{r^q} I$, $\bigl\| \frac{N}{r^m} \bigr\|$, and $x^{(k)}$ by $\| \xi^{(k)} \|$, we obtain
a differential equation for $\xi^{(k)}$:

\[
\frac{d\xi^{(k)}}{dz} = n \Bigl( \frac{p_{-1}}{z} + \frac{p}{1 - z r^{-1}} \Bigr)\, \xi^{(k)}
  + \frac{N z^{k-1}}{r^{k-1}} \cdot \frac{1}{1 - z r^{-1}} . \tag{124}
\]

65 Here $\| N/r^m \|$ denotes the column in which all the elements are equal to one and the
same number, $N/r^m$.
66 Here $\| \alpha_m \|$ denotes the column $(\alpha_m, \alpha_m, \ldots, \alpha_m)$ ($\alpha_m$ is a constant, $m = k,
k+1, \ldots$).

This linear differential equation has the particular solution

\[
\xi^{(k)} = \frac{N}{r^{k-1}}\; z^{n p_{-1}} \Bigl( 1 - \frac{z}{r} \Bigr)^{-npr}
  \int_0^z z^{k - n p_{-1} - 1} \Bigl( 1 - \frac{z}{r} \Bigr)^{npr - 1} dz , \tag{125}
\]

which is regular for $z = 0$ and can be expanded in a neighborhood of this
point in the power series (123) which is convergent for $| z | < r$.

From the convergence of the majorant series (123) it follows that the
series (115) is convergent for $| z | < r$, and the lemma is proved.


Note 1. This proof enables us to determine all the solutions of the differ- 
ential system (112) that are regular at the singular point, pre vided such 
solutions exist. 


For the existence of regular solutions (not identically zero), it is necessary
and sufficient that the residue matrix $P_{-1}$ have a non-negative integral
characteristic value. If s is the greatest integral characteristic value, then columns
$a_0, a_1, \ldots, a_s$ that do not all vanish can be determined from the first s + 1 of
the equations (114); for the determinant of the corresponding linear
homogeneous equation is zero:

$$\Delta = |P_{-1}|\,|E - P_{-1}| \cdots |sE - P_{-1}| = 0.$$


From the remaining equations of (114) the columns $a_{s+1}, a_{s+2}, \ldots$ can be
expressed uniquely in terms of $a_0, a_1, \ldots, a_s$. The series (111) so obtained
converges, by the lemma. Thus, the linearly independent solutions of the
first s + 1 equations (114) determine all the linearly independent solutions
of the system (112) that are regular at the singular point z = 0. 
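
As a small illustration of Note 1, here is a worked scalar case (n = 1). The coefficients $P_{-1} = 2$, $P_0 = 1$, $P_q = 0$ for $q \ge 1$ are chosen here purely for convenience and are not taken from the text. Substituting the series (111) into the equation and comparing coefficients gives relations of the type (114), from which the regular solutions can be read off directly:

```latex
\frac{dx}{dz} = \Bigl(\frac{2}{z} + 1\Bigr)x, \qquad
x = a_0 + a_1 z + a_2 z^2 + \cdots
\;\Longrightarrow\; (m - 2)\,a_m = a_{m-1} \quad (m = 0, 1, 2, \ldots;\ a_{-1} = 0).

\begin{aligned}
m = 0:&\quad -2a_0 = 0 &&\Rightarrow\ a_0 = 0,\\
m = 1:&\quad -a_1 = a_0 = 0 &&\Rightarrow\ a_1 = 0,\\
m = 2:&\quad 0 \cdot a_2 = a_1 = 0 &&\Rightarrow\ a_2 \text{ arbitrary},\\
m \ge 3:&\quad a_m = \frac{a_{m-1}}{m-2} &&\Rightarrow\ a_m = \frac{a_2}{(m-2)!}.
\end{aligned}

x = a_2 z^2 \sum_{j=0}^{\infty} \frac{z^j}{j!} = a_2\, z^2 e^{z}.
```

Here s = 2 is the only non-negative integral characteristic value, the first s + 1 = 3 equations leave exactly one free parameter, and the remaining coefficients are determined uniquely, in agreement with the note.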

If z = 0 is a singular point, then a regular solution (111) at that point
(if such a solution exists) is not uniquely determined when the initial value
$a_0$ is given. However, a solution that is regular at a regular singularity is
uniquely determined when $a_0, a_1, \ldots, a_s$ are given, i.e., when the initial
values at z = 0 of this solution and the initial values of its first s derivatives
are given (s is the largest non-negative integral characteristic value of the
residue matrix $P_{-1}$). 

Note 2. The proof of the lemma remains valid for $P_{-1} = O$. In this
case an arbitrary positive number can be chosen for $p_{-1}$ in the proof of the
lemma. For $P_{-1} = O$ the lemma states the well-known proposition on the
existence of a regular solution in a neighborhood of a regular point of the
system. In this case the solution is uniquely determined when the initial
value $a_0$ is given. 


4. Suppose given the system

$$\frac{dX}{dz} = P(z)\,X, \tag{126}$$

where

$$P(z) = \frac{P_{-1}}{z} + \sum_{m=0}^{\infty} P_m z^m$$

and the series on the right-hand side converges for $|z| < r$. 
Suppose, further, that by setting

$$X = A(z)\,Y \tag{127}$$

and substituting for A(z) the series

$$A(z) = A_0 + A_1 z + A_2 z^2 + \cdots, \tag{128}$$

we obtain after formal transformations:

$$\frac{dY}{dz} = P^*(z)\,Y, \tag{129}$$

where

$$P^*(z) = \frac{P^*_{-1}}{z} + \sum_{m=0}^{\infty} P^*_m z^m,$$

and that here, as in the expression for P(z), the series on the right-hand
side converges for $|z| < r$.

We shall show that the series (128) also converges in the neighborhood
$|z| < r$ of z = 0. 

Indeed, it follows from (126), (127), and (129) that the series (128)
formally satisfies the following matrix differential equation

$$\frac{dA}{dz} = P(z)\,A - A\,P^*(z). \tag{130}$$

We shall regard A as a vector (column) in the space of all matrices of
order n, i.e., a space of dimension $n^2$. If in this space a linear operator
$\hat P(z)$ on A, depending analytically on a parameter z, is defined by the
equation

$$\hat P(z)[A] = P(z)\,A - A\,P^*(z), \tag{131}$$

then the differential equation (130) can be written in the form

$$\frac{dA}{dz} = \hat P(z)[A]. \tag{132}$$

The right-hand side of this equation can be considered as the product of
the matrix $\hat P(z)$ of order $n^2$ and the column A of $n^2$ elements. From (131)
it is clear that z = 0 is a regular singularity of the system (132). The series
(128) formally satisfies this system. Therefore, by applying the lemma, we
conclude that (128) converges in the neighborhood $|z| < r$ of z = 0. 
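
The step from (131) to a matrix of order $n^2$ acting on a column of $n^2$ elements can be made concrete with Kronecker products. The following is a minimal sketch, not the author's construction: it assumes the column is formed by stacking the columns of A (the text does not fix an ordering), and the helper names `kron`, `vec`, and `operator_matrix` are hypothetical. Under that assumption the operator $A \mapsto PA - AP^*$ is represented by $I \otimes P - (P^*)^{T} \otimes I$.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def kron(X, Y):
    """Kronecker product: block (i, j) of the result equals X[i][j] * Y."""
    return [[x * y for x in rx for y in ry] for rx in X for ry in Y]

def vec(A):
    """Stack the columns of A into one column of n^2 elements (an assumed ordering)."""
    n = len(A)
    return [A[i][j] for j in range(n) for i in range(n)]

def operator_matrix(P, Pstar):
    """Matrix of order n^2 representing A -> P A - A Pstar on vec(A)."""
    n = len(P)
    left = kron(identity(n), P)                  # vec(P A) = (I (x) P) vec(A)
    right = kron(transpose(Pstar), identity(n))  # vec(A P*) = (P*^T (x) I) vec(A)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(left, right)]

if __name__ == "__main__":
    P = [[1, 2], [3, 4]]
    Pstar = [[0, 1], [5, -2]]
    for row in operator_matrix(P, Pstar):
        print(row)
```

Applied at each fixed z to the values P(z) and P*(z), this gives the coefficient matrix of the system (132); since both have at most a simple pole at z = 0, so does the resulting matrix.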
In particular, the series for A(z) in (110) also converges.
Thus, we have proved the following theorem:
THEOREM 2. Every system

$$\frac{dX}{dz} = P(z)\,X, \tag{133}$$

with a regular singularity at z = 0,

$$P(z) = \frac{P_{-1}}{z} + \sum_{m=0}^{\infty} P_m z^m,$$

has a solution of the form

$$X = A(z)\, z^M z^U, \tag{134}$$

where A(z) is a matrix function that is regular for z = 0 and becomes the
unit matrix E at that point, and where M and U are constant matrices, M
being of simple structure and having integral characteristic values, whereas
the difference between any two distinct characteristic values of U is not an
integer. 

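If the matrix $P_{-1}$ is reduced to the Jordan form by means of a non-singular
matrix T

$$P_{-1} = T\{\lambda_1 E_1 + H_1,\ \lambda_2 E_2 + H_2,\ \ldots,\ \lambda_s E_s + H_s\}\,T^{-1} \tag{135}$$
$$(\operatorname{Re}\lambda_1 \ge \operatorname{Re}\lambda_2 \ge \cdots \ge \operatorname{Re}\lambda_s),$$

then M and U can be chosen in the form

$$M = T\{m_1 E_1,\ m_2 E_2,\ \ldots,\ m_s E_s\}\,T^{-1}, \tag{136}$$

$$U = T \begin{pmatrix} \tilde\lambda_1 E_1 + H_1 & B_{12} & \cdots & B_{1s} \\ O & \tilde\lambda_2 E_2 + H_2 & \cdots & B_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & \tilde\lambda_s E_s + H_s \end{pmatrix} T^{-1}, \tag{137}$$

where

$$m_i = [\lambda_i], \qquad \tilde\lambda_i = \lambda_i - m_i \qquad (i = 1, 2, \ldots, s). \tag{138}$$

The $B_{ik}$ are regular lower triangular matrices (i, k = 1, 2, ..., s) and $B_{ik} = O$
if $\lambda_i - \lambda_k$ is not a positive integer (i, k = 1, 2, ..., s).

In the particular case where none of the differences $\lambda_i - \lambda_k$ (i, k = 1, 2,
3, ..., s) is a positive integer, we can set in (134) M = O and $U = P_{-1}$; i.e.,
in this case the solution can be represented in the form

$$X = A(z)\, z^{P_{-1}}. \tag{139}$$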
Note 1. We wish to point out that in this section we have developed an
algorithm to determine the coefficients of the series $A(z) = \sum_{m=0}^{\infty} A_m z^m$
($A_0 = E$) in terms of the coefficients $P_m$ of the series for P(z). Moreover,
the theorem also determines the integral substitution V by which the solution
(134) is multiplied when a circuit is made once in the positive direction
around the singular point z = 0:

$$V = e^{2\pi i U}.$$
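Note 2. From the enunciation of the theorem it follows that

$$B_{ik} = O \quad \text{for } \tilde\lambda_i \ne \tilde\lambda_k \qquad (i, k = 1, 2, \ldots, s).$$

Therefore the matrices

$$\tilde\Lambda = T\{\tilde\lambda_1 E_1,\ \tilde\lambda_2 E_2,\ \ldots,\ \tilde\lambda_s E_s\}\,T^{-1} \quad\text{and}\quad \tilde U = T \begin{pmatrix} H_1 & B_{12} & \cdots & B_{1s} \\ O & H_2 & \cdots & B_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & H_s \end{pmatrix} T^{-1} \tag{140}$$

are permutable:

$$\tilde\Lambda \tilde U = \tilde U \tilde\Lambda.$$

Hence

$$z^M z^U = z^M z^{\tilde\Lambda + \tilde U} = z^M z^{\tilde\Lambda} z^{\tilde U} = z^{\Lambda} z^{\tilde U}, \tag{141}$$

where

$$\Lambda = M + \tilde\Lambda = T\{\lambda_1, \lambda_2, \ldots, \lambda_n\}\,T^{-1} \tag{142}$$

and where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are all the characteristic values of $P_{-1}$ arranged
in the order $\operatorname{Re}\lambda_1 \ge \operatorname{Re}\lambda_2 \ge \cdots \ge \operatorname{Re}\lambda_n$.
On the other hand,

$$z^{\tilde U} = h(\tilde U),$$

where $h(\lambda)$ is the Lagrange-Sylvester interpolation polynomial for $f(\lambda) = z^{\lambda}$.

Since all the characteristic values of $\tilde U$ are zero, $h(\lambda)$ depends linearly
on $f(0), f'(0), \ldots, f^{(g-1)}(0)$, i.e., on $1, \ln z, \ldots, (\ln z)^{g-1}$ (g is the least
exponent for which $\tilde U^g = O$). Therefore

$$h(\lambda) = \sum_{j=0}^{g-1} h_j(\lambda)\,(\ln z)^j$$

and

$$z^{\tilde U} = h(\tilde U) = \sum_{j=0}^{g-1} h_j(\tilde U)\,(\ln z)^j = T \begin{pmatrix} 1 & q_{12} & \cdots & q_{1n} \\ 0 & 1 & \cdots & q_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} T^{-1}, \tag{143}$$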
where $q_{ij}$ (i, j = 1, 2, ..., n; i < j) are polynomials in ln z of degree less
than g.
By (134), (141), (142), and (143) a particular solution of (126) can be
chosen in the form

$$X = A(z) \begin{pmatrix} z^{\lambda_1} & 0 & \cdots & 0 \\ 0 & z^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & z^{\lambda_n} \end{pmatrix} \begin{pmatrix} 1 & q_{12} & \cdots & q_{1n} \\ 0 & 1 & \cdots & q_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. \tag{144}$$

Here $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the characteristic values of $P_{-1}$ arranged in the order
$\operatorname{Re}\lambda_1 \ge \operatorname{Re}\lambda_2 \ge \cdots \ge \operatorname{Re}\lambda_n$, and $q_{ij}$ (i, j = 1, 2, ..., n; i < j) are
polynomials in ln z of degree not higher than g - 1, where g is the maximal
number of characteristic values $\lambda_i$ that differ from each other by an integer; A(z)
is a matrix function, regular at z = 0, and A(0) = T (|T| ≠ 0). If $P_{-1}$ has
the Jordan form, then T = E. 


§ 11. Reducible Analytic Systems 


1. As an application of the theorem of the preceding section we shall
investigate in what cases the system

$$\frac{dX}{dt} = Q(t)\,X, \tag{145}$$

where

$$Q(t) = \sum_{m=1}^{\infty} \frac{Q_m}{t^m} \tag{146}$$

is a convergent series for $t > t_0$, is reducible (in the sense of Lyapunov), i.e.,
in what cases the system has a solution of the form

$$X = L(t)\,e^{Bt}, \tag{147}$$

where L(t) is a Lyapunov matrix (i.e., L(t) satisfies the conditions 1.-3. on
p. 117) and B is a constant matrix.⁶⁷ Here X and Q are matrices with
complex elements and t is a real variable.

We make the transformation

$$z = \frac{1}{t}.$$

67 If the equation (147) holds, then the Lyapunov transformation X = L(t)Y carries
the system (145) into the system $dY/dt = BY$. 
Then the system (145) assumes the form

$$\frac{dX}{dz} = P(z)\,X, \tag{148}$$

where

$$P(z) = -z^{-2}\,Q\!\left(\frac{1}{z}\right) = -\frac{Q_1}{z} - \sum_{m=0}^{\infty} Q_{m+2}\, z^m. \tag{149}$$

The series on the right-hand side of the expression for P(z) converges for
$|z| < 1/t_0$. Two cases can arise:

1) $Q_1 = O$. In that case z = 0 is not a singular point of the system (148).
The system has a solution that is regular and normalized at z = 0. This
solution is given by a convergent power series

$$X(z) = E + X_1 z + X_2 z^2 + \cdots \qquad \left(|z| < \frac{1}{t_0}\right).$$

Setting

$$L(t) = X\!\left(\frac{1}{t}\right), \qquad B = O,$$

we obtain the required representation (147). The system is reducible.

2) $Q_1 \ne O$. In that case the system (148) has a regular singularity at
z = 0. 

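Without loss of generality we may assume that the residue matrix
$P_{-1} = -Q_1$ is reduced to the Jordan form in which the diagonal elements
$\lambda_1, \lambda_2, \ldots, \lambda_n$ are arranged in the order $\operatorname{Re}\lambda_1 \ge \operatorname{Re}\lambda_2 \ge \cdots \ge \operatorname{Re}\lambda_n$.

Then in (144) T = E, and therefore the system (148) has the solution

$$X = A(z) \begin{pmatrix} z^{\lambda_1} & 0 & \cdots & 0 \\ 0 & z^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & z^{\lambda_n} \end{pmatrix} \begin{pmatrix} 1 & q_{12} & \cdots & q_{1n} \\ 0 & 1 & \cdots & q_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix},$$

where the function A(z) is regular for z = 0 and assumes at this point the
value E, and where $q_{ik}$ (i, k = 1, 2, ..., n; i < k) are polynomials in ln z.
When we replace z by 1/t, we have:

$$X = A\!\left(\frac{1}{t}\right) \begin{pmatrix} \left(\frac{1}{t}\right)^{\lambda_1} & 0 & \cdots & 0 \\ 0 & \left(\frac{1}{t}\right)^{\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \left(\frac{1}{t}\right)^{\lambda_n} \end{pmatrix} \begin{pmatrix} 1 & q_{12}\!\left(\ln\frac{1}{t}\right) & \cdots & q_{1n}\!\left(\ln\frac{1}{t}\right) \\ 0 & 1 & \cdots & q_{2n}\!\left(\ln\frac{1}{t}\right) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. \tag{150}$$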
Since X = A(1/t)Y is a Lyapunov transformation, the system (145) is
reducible to a system with constant coefficients if and only if the product

$$L_1(t) = \begin{pmatrix} t^{-\lambda_1} & 0 & \cdots & 0 \\ 0 & t^{-\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & t^{-\lambda_n} \end{pmatrix} \begin{pmatrix} 1 & q_{12}\!\left(\ln\frac{1}{t}\right) & \cdots & q_{1n}\!\left(\ln\frac{1}{t}\right) \\ 0 & 1 & \cdots & q_{2n}\!\left(\ln\frac{1}{t}\right) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} e^{-Bt}, \tag{151}$$

where B is a constant matrix, is a Lyapunov matrix, i.e., when the matrices
$L_1(t)$, $dL_1/dt$, and $L_1^{-1}(t)$ are bounded. It follows from the theorem of Erugin
(§ 4) that the matrix B can be assumed here to have real characteristic
values.

Since $L_1(t)$ and $L_1^{-1}(t)$ are bounded for $t > t_0$, all the characteristic
values of B must be zero. This follows from the expression for $e^{Bt}$ and $e^{-Bt}$
obtained from (151). Moreover, all the numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$ must be pure
imaginary, because by (151) the fact that the elements of the last row of
$L_1(t)$ and of the first column of $L_1^{-1}(t)$ are bounded implies that $\operatorname{Re}\lambda_n \ge 0$
and $\operatorname{Re}\lambda_1 \le 0$. 

But if all the characteristic values of $P_{-1}$ are pure imaginary, then the
difference between any two distinct characteristic values of $P_{-1}$ cannot be
an integer. Therefore the formula (139) holds

$$X = A(z)\, z^{P_{-1}} = A\!\left(\frac{1}{t}\right) t^{Q_1},$$

and for the reducibility of the system it is necessary and sufficient that
the matrix

$$L_2(t) = t^{Q_1} e^{-Bt} \tag{152}$$

together with its inverse be bounded for $t > t_0$. 


Since all the characteristic values of B must be zero, the minimal
polynomial of B is of the form $\lambda^d$. We denote by

$$\psi(\lambda) = (\lambda - \mu_1)^{c_1} (\lambda - \mu_2)^{c_2} \cdots (\lambda - \mu_u)^{c_u} \qquad (\mu_i \ne \mu_k \ \text{for } i \ne k)$$

the minimal polynomial of $Q_1$. As $Q_1 = -P_{-1}$, the numbers $\mu_1, \mu_2, \ldots, \mu_u$
differ only in sign from the corresponding numbers $\lambda_i$ and are therefore all
pure imaginary. Then (see the formulas (12), (13) on p. 116) 
$$t^{Q_1} = \sum_{k=1}^{u} \left[ U_{k0} + U_{k1} \ln t + \cdots + U_{k,\,c_k - 1} (\ln t)^{c_k - 1} \right] t^{\mu_k}, \tag{153}$$

$$e^{Bt} = V_0 + V_1 t + \cdots + V_{d-1} t^{d-1}. \tag{154}$$

Substituting these expressions in the equation

$$L_2(t)\, e^{Bt} = t^{Q_1},$$

we obtain

$$\left[ L_2(t)\, V_{d-1} + (*) \right] t^{d-1} = Z_0(t)\, (\ln t)^{c-1}, \tag{155}$$

where c is the greatest of the numbers $c_1, c_2, \ldots, c_u$, (*) denotes a matrix
that tends to zero for $t \to \infty$, and $Z_0(t)$ is a bounded matrix for $t > t_0$.
Since the matrices on both sides of (155) must be of equal order of
magnitude for $t \to \infty$, we have

$$d = c = 1,$$

i.e.,

$$B = O,$$

and the matrix $Q_1$ has simple elementary divisors. 
Conversely, if $Q_1$ has simple elementary divisors and pure imaginary
characteristic values $\mu_1, \mu_2, \ldots, \mu_n$, then

$$X = A(z)\, z^{-Q_1} = A(z)\, \| z^{-\mu_i} \delta_{ik} \|_1^n$$

is a solution of (149). Setting z = 1/t, we find:

$$X = A\!\left(\frac{1}{t}\right) \| t^{\mu_i} \delta_{ik} \|_1^n.$$

The function X(t) as well as $dX/dt$ and the inverse matrix $X^{-1}(t)$ are
bounded for $t > t_0$. Therefore the system is reducible (B = O). Thus we
have proved the following theorem:⁶⁸

THEOREM 3: The system

$$\frac{dX}{dt} = Q(t)\,X,$$

where the matrix Q(t) can be represented in a series convergent for $t > t_0$

$$Q(t) = \frac{Q_1}{t} + \frac{Q_2}{t^2} + \cdots,$$

is reducible if and only if all the elementary divisors of the residue matrix
$Q_1$ are simple and all its characteristic values pure imaginary. 
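
Two small examples may help to see what the criterion says. They are illustrations chosen here, not taken from the text: in both, the series for Q(t) is assumed to consist of the single term $Q_1/t$, so that $X = t^{Q_1} = e^{Q_1 \ln t}$ is a solution for t > 0.

```latex
% (a) Pure imaginary characteristic values \pm i, simple elementary divisors:
Q_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
t^{Q_1} = \begin{pmatrix} \cos \ln t & \sin \ln t \\ -\sin \ln t & \cos \ln t \end{pmatrix}.

% (b) Characteristic value 0 (real part zero), but elementary divisor \lambda^2:
Q_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
t^{Q_1} = \begin{pmatrix} 1 & \ln t \\ 0 & 1 \end{pmatrix}.
```

In case (a) the solution, its inverse, and its derivative $(Q_1/t)\,t^{Q_1}$ are bounded for $t > t_0 > 0$, so the system is reducible with B = O. In case (b) the entry ln t is unbounded, and by the theorem no choice of B makes the system reducible.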


68 See Erugin [13]. The theorem is proved for the case where $Q_1$ does not have
distinct characteristic values that differ from each other by an integer. 
§ 12. Analytic Functions of Several Matrices and their 
Application to the Investigation of Differential Systems. 
The Papers of Lappo-Danilevskii 


1. An analytic function of m matrices $X_1, X_2, \ldots, X_m$ of order n can be
given by a series

$$F(X_1, X_2, \ldots, X_m) = \alpha_0 + \sum_{\nu=1}^{\infty} \sum_{j_1, j_2, \ldots, j_\nu}^{(1, \ldots, m)} \alpha_{j_1 j_2 \ldots j_\nu}\, X_{j_1} X_{j_2} \cdots X_{j_\nu} \tag{156}$$

convergent for all matrices $X_j$ of order n that satisfy the inequality

$$\operatorname{mod} X_j < R_j \qquad (j = 1, 2, \ldots, m). \tag{157}$$

Here the coefficients

$$\alpha_0, \quad \alpha_{j_1 j_2 \ldots j_\nu} \qquad (j_1, j_2, \ldots, j_\nu = 1, 2, \ldots, m;\ \nu = 1, 2, 3, \ldots)$$

are complex numbers, $R_j$ (j = 1, 2, ..., m) are constant matrices of order n
with positive elements, and $X_j$ (j = 1, 2, ..., m) are permutable matrices
of the same order with complex elements. 

The theory of analytic functions of several matrices was developed by 
I. A. Lappo-Danilevskii. He used this theory as a basis for fundamental 
investigations on systems of linear differential equations with rational 
coefficients. 

A system with rational coefficients can always be reduced to the form

$$\frac{dX}{dz} = \sum_{j=1}^{m} \left[ \frac{U_{j0}}{(z - a_j)^{s_j}} + \frac{U_{j1}}{(z - a_j)^{s_j - 1}} + \cdots + \frac{U_{j,\,s_j - 1}}{z - a_j} \right] X \tag{158}$$

after a suitable transformation of the independent variable, where $U_{jk}$ are
constant matrices of order n, $a_j$ are complex numbers, and $s_j$ are positive
integers ($k = 0, 1, \ldots, s_j - 1$; $j = 1, 2, \ldots, m$).⁶⁹

We shall illustrate some of Lappo-Danilevskii’s results in the special case
of the so-called regular systems. The latter are characterized by the
condition $s_1 = s_2 = \cdots = s_m = 1$ and can be written in the form

$$\frac{dX}{dz} = \sum_{j=1}^{m} \frac{U_j}{z - a_j}\, X. \tag{159}$$


69 In the system (158) all the coefficients are regular rational fractions in z.
Arbitrary rational coefficients can be reduced to this form by carrying a finite point z = c
that is regular (for all coefficients) by means of a fractional linear transformation on z
into z = ∞. 




Following Lappo-Danilevskii, we introduce special analytic functions,
namely hyperlogarithms, which are defined by the following recurrence
relations:

$$l_b(z;\, a_{j_1}) = \int_b^z \frac{dz}{z - a_{j_1}},$$

$$l_b(z;\, a_{j_1}, a_{j_2}, \ldots, a_{j_\nu}) = \int_b^z \frac{l_b(z;\, a_{j_2}, a_{j_3}, \ldots, a_{j_\nu})}{z - a_{j_1}}\, dz.$$

Regarding $a_1, a_2, \ldots, a_m, \infty$ as branch points of logarithmic type, we
construct the corresponding Riemann surface $S(a_1, a_2, \ldots, a_m; \infty)$. Every
hyperlogarithm is a single-valued function on this surface. On the other
hand, the matricant $\Omega_b^z$ of the system (159) (i.e., the solution normalized at
z = b) after analytic continuation can also be regarded as a single-valued
function on $S(a_1, a_2, \ldots, a_m; \infty)$; here b can be chosen as an arbitrary finite
point on S other than $a_1, a_2, \ldots, a_m$.

For the normalized solution $\Omega_b^z$ Lappo-Danilevskii gives an explicit
expression in terms of the defining matrices $U_1, U_2, \ldots, U_m$ of (159) in the
form of a series

$$\Omega_b^z = E + \sum_{\nu=1}^{\infty} \sum_{j_1, \ldots, j_\nu}^{(1, \ldots, m)} l_b(z;\, a_{j_1}, a_{j_2}, \ldots, a_{j_\nu})\, U_{j_1} U_{j_2} \cdots U_{j_\nu}. \tag{160}$$

This expansion converges uniformly in z for arbitrary $U_1, U_2, \ldots, U_m$ and
represents $\Omega_b^z$ in any finite domain on $S(a_1, a_2, \ldots, a_m; \infty)$ provided only
that the domain does not contain $a_1, a_2, \ldots, a_m$ in the interior or on the
boundary.

If the series (156) converges for arbitrary matrices $X_1, X_2, \ldots, X_m$, then
the corresponding function $F(X_1, X_2, \ldots, X_m)$ is called entire. $\Omega_b^z$ is an
entire function of the matrices $U_1, U_2, \ldots, U_m$. 

If in (160) we let the argument z go around the point $a_j$ once in the
positive direction along a contour that does not enclose other points $a_i$ (for
i ≠ j), then we obtain the expression for the integral substitution $V_j$
corresponding to the point $z = a_j$:

$$V_j = E + \sum_{\nu=1}^{\infty} \sum_{j_1, \ldots, j_\nu}^{(1, \ldots, m)} p_j(b;\, a_{j_1}, a_{j_2}, \ldots, a_{j_\nu})\, U_{j_1} U_{j_2} \cdots U_{j_\nu} \tag{161}$$
$$(j = 1, 2, \ldots, m),$$

where in a readily understandable notation

$$p_j(b;\, a_{j_1}) = \int_{(a_j)} \frac{dz}{z - a_{j_1}},$$

$$p_j(b;\, a_{j_1}, a_{j_2}, \ldots, a_{j_\nu}) = \int_{(a_j)} \frac{l_b(z;\, a_{j_2}, a_{j_3}, \ldots, a_{j_\nu})}{z - a_{j_1}}\, dz$$
$$(j_1, j_2, \ldots, j_\nu,\ j = 1, 2, \ldots, m;\ \nu = 1, 2, 3, \ldots).$$

The series (161), like (160), is an entire function of $U_1, U_2, \ldots, U_m$. 


2. Generalizing the theory of analytic functions to the case⁷⁰ of a countably
infinite set of matrix arguments $X_1, X_2, X_3, \ldots$, Lappo-Danilevskii has
used it to study the behavior of a solution of a system in a neighborhood
of an irregular singularity.⁷¹ We quote the basic result.


The normalized solution $\Omega_b^z$ of the system 


ax. 72 
=—g 


where the power series on the right-hand side converges for $|z| < r$ ($r > 1$),⁷²
can be represented by a series 


QaE+ DS: Df Pees Pi X 


vow fir ter ver Jp — 
Shite Hip SA Pha Ais es, A of) , 
7 Ui fpfees u #(A - In’ yo. 
‘ a : ; a Oe a eed WD Bd Miss vy WO. 
p= = t= £ 


Here ay ] vg, and of”) are scalar coefficients that are defined by 
M+) , 


eT 


special formulas. The series (162) converges for arbitrary matrices P;, Pe, 
...in an annulus 


$$\varrho < |z| < r$$

($\varrho$ is any positive number less than r). The point b must also lie in this
annulus ($\varrho < |b| < r$). 


70 See [29], Vol. I, Memoir 1. 
71 See [29], Vol. I, Memoir 3. See also [252], [253], [254], [146], and [147]. 


72 The restriction r > 1 is not essential, since this condition can always be obtained by
replacing z by αz, where α is a suitably chosen positive number. 
Since in this book we cannot possibly describe the contents of the papers
of Lappo-Danilevskii in sufficient detail, we have had to restrict ourselves
to giving above statements of a few basic results and we must refer the reader
to the appropriate literature.

All the papers of Lappo-Danilevskii that deal with differential equations
have been published posthumously in three volumes ([29]: Mémoires sur la
théorie des systèmes des équations différentielles linéaires (1934-36)).
Moreover, his fundamental results are expounded in the papers [252], [253],
[254] and the small book [28]. A concise exposition of some of the results
can also be found in the book by V. I. Smirnov [56], Vol. III. 


CHAPTER XV 
THE PROBLEM OF ROUTH-HURWITZ AND RELATED QUESTIONS 


§ 1. Introduction 


In Chapter XIV, § 3 we explained that according to Lyapunov’s theorem
the zero solution of the system of differential equations

$$\frac{dx_i}{dt} = \sum_{k=1}^{n} a_{ik} x_k + (**) \tag{1}$$

($a_{ik}$ (i, k = 1, 2, ..., n) are constant coefficients) with arbitrary terms (**)
of the second and higher orders in $x_1, x_2, \ldots, x_n$ is stable if all the
characteristic values of the matrix $A = \|a_{ik}\|_1^n$, i.e., all the roots of the secular
equation $\Delta(\lambda) \equiv |\lambda E - A| = 0$, have negative real parts. 

Therefore the task of establishing necessary and sufficient conditions 
under which all the roots of a given algebraic equation lie in the left half- 
plane is of great significance in a number of applied fields in which the 
stability of mechanical and electrical systems is investigated. 

The importance of this algebraic task was clear to the founders of the 
theory of governors, the British physicist J. C. Maxwell and the Russian 
scientific research engineer I. A. Vyshnegradskii who, in their papers on 
governors,¹ established and extensively applied the above-mentioned
algebraic conditions for equations of a degree not exceeding three. 

In 1868 Maxwell proposed the mathematical problem of discovering cor- 
responding conditions for algebraic equations of arbitrary degree. Actually 
this problem had already been solved in essence by the French mathematician 
Hermite in a paper [187] published in 1856. In this paper he had estab- 
lished a close connection between the number of roots of a complex poly- 
nomial f(z) in an arbitrary half-plane (and even inside an arbitrary 
triangle) and the signature of a certain quadratic form. But Hermite’s 


1 J. C. Maxwell, ‘On governors’ Proc. Roy. Soc. London, vol. 16 (1868); I. A.
Vyshnegradskii, ‘On governors with direct action’ (1876). These papers were reprinted in the
survey ‘Theory of automatic governors’ (Izd. Akad. Nauk SSSR, 1949). See also the
paper by A. A. Andronov and I. N. Voznesenskii, ‘On the work of J. C. Maxwell, I. A.
Vyshnegradskii, and A. Stodola in the theory of governors of machines.’ 
results had not been carried to a stage at which they could be used by
specialists working in applied fields and therefore his paper did not receive
due recognition.

In 1875 the British applied mathematician Routh [47], [48], using Sturm’s
theorem and the theory of Cauchy indices, set up an algorithm to determine
the number k of roots of a real polynomial in the right half-plane (Re z > 0).
In the particular case k = 0 this algorithm then gives a criterion for stability.

At the end of the 19th century, the Austrian research engineer A. Stodola,
the founder of the theory of steam and gas turbines, unaware of Routh’s
paper, again proposed the problem of finding conditions under which all the
roots of an algebraic equation have negative real parts, and in 1895 A. Hurwitz
[204] on the basis of Hermite’s paper gave another solution (independent
of Routh’s). The determinantal inequalities obtained by Hurwitz are
known nowadays as the inequalities of Routh-Hurwitz.

However, even before Hurwitz’ paper appeared, the founder of the
modern theory of stability, A. M. Lyapunov, had proved in his celebrated
dissertation (‘The general problem of stability of motion,’ Kharkov, 1892)²
a theorem which yields necessary and sufficient conditions for all the roots
of the characteristic equation of a real matrix $A = \|a_{ik}\|_1^n$ to have negative
real parts. These conditions are made use of in a number of papers
on the theory of governors.³

A new criterion of stability was set up in 1914 by the French
mathematicians Liénard and Chipart [259]. Using special quadratic forms, these
authors obtained a criterion of stability which has a definite advantage over
the Routh-Hurwitz criterion (the number of determinantal inequalities in
the Liénard-Chipart criterion is roughly half of that in the Routh-Hurwitz
criterion). 

The famous Russian mathematicians P. L. Chebyshev and A. A. Markov 
have proved two remarkable theorems on continued-fraction expansions of a 
special type. These theorems, as will be shown in § 16, have an immediate 
bearing on the Routh-Hurwitz problem. 

The reader will see that in the sphere of problems we have outlined, the 
theory of quadratic forms (Vol. I, Chapter X) and, in particular, the theory 
of Hankel forms (Vol. I, Chapter X, § 10) forms an essential tool. 


§ 2. Cauchy Indices 


1. We begin with a discussion of the so-called Cauchy indices.* 


2 See [32], § 20. 
3 See, for example, {102]. 
DEFINITION 1: The Cauchy index of a real rational function R(x)
between the limits a and b (notation: $I_a^b R(x)$; a and b are real numbers or
±∞) is the difference between the number of jumps of R(x) from −∞ to
+∞ and that of jumps from +∞ to −∞ as the argument changes from
a to b.⁵

According to this definition, if

$$R(x) = \sum_{i=1}^{p} \frac{A_i}{x - \alpha_i} + R_1(x),$$

where $A_i$, $\alpha_i$ (i = 1, 2, ..., p) are real numbers and $R_1(x)$ is a rational
function⁶ without real poles, then⁷

$$I_{-\infty}^{+\infty} R(x) = \sum_{i=1}^{p} \operatorname{sign} A_i \tag{2}$$

and, in general,

$$I_a^b R(x) = \sum_{a < \alpha_i < b} \operatorname{sign} A_i \qquad (a < b). \tag{2'}$$


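In particular, if $f(x) = a_0 (x - \alpha_1)^{n_1} \cdots (x - \alpha_m)^{n_m}$ is a real polynomial
($\alpha_i \ne \alpha_k$ for i ≠ k; i, k = 1, 2, ..., m) and if among its roots $\alpha_1, \alpha_2, \ldots, \alpha_m$
only the first p are real, then

$$\frac{f'(x)}{f(x)} = \sum_{j=1}^{m} \frac{n_j}{x - \alpha_j} = \sum_{i=1}^{p} \frac{n_i}{x - \alpha_i} + R_1(x), \tag{2''}$$

where $R_1(x)$ is a real rational function without real poles.
Therefore, by (2’): The index

$$I_a^b \frac{f'(x)}{f(x)} \qquad (a < b)$$

is equal to the number of distinct real roots of f(x) in the interval (a, b). 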


An arbitrary real rational function R(x) can always be represented in
the form

$$R(x) = \sum_{i=1}^{p} \left[ \frac{A_1^{(i)}}{x - \alpha_i} + \cdots + \frac{A_{n_i}^{(i)}}{(x - \alpha_i)^{n_i}} \right] + R_1(x),$$

where all the α and A are real numbers ($A_{n_i}^{(i)} \ne 0$; i = 1, 2, ..., p) and
$R_1(x)$ has no real poles.
Then⁸


5 In counting the number of jumps, the extreme values of x (the limits a and b) are
not included.

6 The poles of a rational function are those values of the argument for which the
function becomes infinite.

7 By sign a (a is a real number) we mean +1, −1, or 0 according as a > 0, a < 0,
or a = 0. 
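$$I_{-\infty}^{+\infty} R(x) = \sum_{n_i\ \text{odd}} \operatorname{sign} A_{n_i}^{(i)} \tag{3}$$

and, in general,

$$I_a^b R(x) = \sum_{\substack{a < \alpha_i < b \\ n_i\ \text{odd}}} \operatorname{sign} A_{n_i}^{(i)} \qquad (a < b). \tag{3'}$$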

2. One of the methods of computing the index $I_a^b R(x)$ is based on the
classical theorem of Sturm.
We consider a sequence of real polynomials

$$f_1(x),\ f_2(x),\ \ldots,\ f_m(x) \tag{4}$$

that has the two following properties with respect to the interval (a, b):⁹

1. For every value x (a < x < b), if any $f_k(x)$ vanishes, the two adjacent
functions $f_{k-1}(x)$ and $f_{k+1}(x)$ have values different from zero and of
opposite signs; i.e., for a < x < b it follows from $f_k(x) = 0$ that

$$f_{k-1}(x)\, f_{k+1}(x) < 0.$$

2. The last function $f_m(x)$ in (4) does not vanish in the interval (a, b);
i.e., $f_m(x) \ne 0$ for a < x < b. 


Such a sequence (4) of polynomials is called a Sturm chain in the
interval (a, b).

We denote by V(x) the number of variations of sign in (4) for a fixed
value x.¹⁰ Then the value of V(x), as x varies from a to b, can only change
when one of the functions in (4) passes through zero. But by 1., when the
functions $f_k(x)$ (k = 2, ..., m − 1) pass through zero, the value of V(x)
does not change. When $f_1(x)$ passes through zero, then one variation of
sign in (4) is lost or gained according as the ratio $f_2(x)/f_1(x)$ goes from
−∞ to +∞ or vice versa. Hence we have: 


THEOREM 1 (Sturm): If $f_1(x), f_2(x), \ldots, f_m(x)$ is a Sturm chain in
(a, b) and V(x) is the number of variations of sign in the chain, then

$$I_a^b \frac{f_2(x)}{f_1(x)} = V(a) - V(b). \tag{5}$$


8 In (3) the sum is extended over all the values i for which the corresponding $n_i$ is odd.
In (3’) the sum is extended over all the i for which $n_i$ is odd and $a < \alpha_i < b$.

9 Here a may be −∞ and b may be +∞.

10 If a < x < b and $f_1(x) \ne 0$, then by 1. in the determination of V(x) a zero value
in (4) may be omitted or an arbitrary sign may be attributed to this value. If a is finite,
then V(a) must be interpreted as V(a + ε), where ε is a positive number sufficiently
small that in the half-closed interval (a, a + ε] none of the functions $f_i(x)$ vanishes.
In exactly the same way, if b is finite, V(b) is to be interpreted as V(b − ε), where the
number ε is defined similarly. 
Note. Let us multiply all the terms of a Sturm chain by one and the
same arbitrary polynomial d(x). The chain of polynomials so obtained is
called a generalized Sturm chain. Since the multiplication of all the terms
of (4) by one and the same polynomial alters neither the left-hand nor the
right-hand side of (5), Sturm’s theorem remains valid for generalized
Sturm chains.

Note that if f(x) and g(x) are any two polynomials (where the degree
of f(x) is not less than that of g(x)), then we can always construct a
generalized Sturm chain (4) beginning with $f_1(x) = f(x)$, $f_2(x) = g(x)$ by
means of the Euclidean algorithm. 

For if we denote by $-f_3(x)$ the remainder on dividing $f_1(x)$ by $f_2(x)$,
by $-f_4(x)$ the remainder on dividing $f_2(x)$ by $f_3(x)$, etc., then we have
the chain of identities

$$\begin{aligned} f_1(x) &= q_1(x)\, f_2(x) - f_3(x), \\ &\;\;\vdots \\ f_{k-1}(x) &= q_{k-1}(x)\, f_k(x) - f_{k+1}(x), \\ &\;\;\vdots \\ f_{m-1}(x) &= q_{m-1}(x)\, f_m(x), \end{aligned} \tag{6}$$

where the last remainder $f_m(x)$ that is not identically zero is the greatest
common divisor of f(x) and g(x) and also of all the functions of the
sequence (4) so constructed. If $f_m(x) \ne 0$ (a < x < b) then this sequence
(4) satisfies the conditions 1., 2. by (6) and is a Sturm chain. If the
polynomial $f_m(x)$ has roots in the interval (a, b), then (4) is a generalized
Sturm chain, because it becomes a Sturm chain when all the terms are
divided by $f_m(x)$. 

From what we have shown it follows that the index of every rational
function R(x) can be determined by Sturm’s theorem. For this purpose
it is sufficient to represent R(x) in the form $Q(x) + \dfrac{g(x)}{f(x)}$, where Q(x), f(x),
g(x) are polynomials and the degree of g(x) does not exceed that of f(x).
If we then construct the generalized Sturm chain for f(x), g(x), we have

$$I_a^b R(x) = I_a^b \frac{g(x)}{f(x)} = V(a) - V(b).$$

By means of Sturm’s theorem we can determine the number of distinct
real roots of a polynomial f(x) in the interval (a, b), since this number, as
we have seen, is $I_a^b \dfrac{f'(x)}{f(x)}$. 
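
The construction of this section can be sketched in a few lines of code. This is only an illustration under stated assumptions: polynomials are lists of coefficients with the highest degree first, arithmetic is exact (`fractions.Fraction`), the endpoints a and b are finite and are not roots of f(x), and all function names are hypothetical. The chain is built as in (6), with each new term equal to minus the remainder, and the root count is V(a) − V(b) for the chain starting with f(x), f'(x).

```python
from fractions import Fraction

def trim(p):
    """Drop leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def poly_rem(f, g):
    """Remainder on dividing f by g (g must not be the zero polynomial)."""
    r = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    while r and len(r) >= len(g):
        q = r[0] / g[0]
        for i in range(len(g)):
            r[i] -= q * g[i]
        r = trim(r)  # the leading coefficient is now zero and is dropped
    return r

def sturm_chain(f, g):
    """Generalized Sturm chain f_1 = f, f_2 = g, f_{k+1} = -(f_{k-1} mod f_k)."""
    chain = [trim([Fraction(c) for c in f]), trim([Fraction(c) for c in g])]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def evaluate(p, x):
    """Horner evaluation."""
    v = Fraction(0)
    for c in p:
        v = v * x + c
    return v

def variations(chain, x):
    """Number of sign variations V(x); zero values are omitted (footnote 10)."""
    vals = [v for v in (evaluate(p, x) for p in chain) if v != 0]
    return sum(1 for s, t in zip(vals, vals[1:]) if (s > 0) != (t > 0))

def derivative(f):
    n = len(f) - 1
    return [c * (n - i) for i, c in enumerate(f[:-1])]

def count_distinct_real_roots(f, a, b):
    """Index of f'/f between a and b, computed as V(a) - V(b)."""
    chain = sturm_chain(f, derivative(f))
    return variations(chain, Fraction(a)) - variations(chain, Fraction(b))

if __name__ == "__main__":
    # f(x) = (x - 1)^2 (x + 2)(x^2 + 1) = x^5 - 2x^3 + 2x^2 - 3x + 2
    f = [1, 0, -2, 2, -3, 2]
    print(count_distinct_real_roots(f, -3, 3))
```

The example polynomial has a double root, so the chain obtained is a generalized Sturm chain in the sense of the Note; the count still refers to distinct roots.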
§ 3. Routh’s Algorithm


1. Routh’s problem consists in determining the number k of roots of a
real polynomial f(z) in the right half-plane (Re z > 0).
To begin with, we treat the case where f(z) has no roots on the imaginary
axis. In the right half-plane we construct the semicircle of radius R with
its center at the origin and we consider the domain bounded by this
semicircle and the segment of the imaginary axis (Fig. 7). For sufficiently
large R all the zeros of f(z) with positive real parts lie inside this domain.
Therefore arg f(z) increases by 2kπ on going in the positive direction along
the contour of the domain.¹¹ On the other hand, the increase of arg f(z)
along the semicircle of radius R for R → ∞ is determined by the increase of
the argument of the highest term $a_0 z^n$ and is therefore nπ. Hence the
increase of arg f(z) along the imaginary axis (R → ∞) is given by the
expression

$$\Delta_{-\infty}^{+\infty} \arg f(i\omega) = (n - 2k)\,\pi. \tag{7}$$


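We introduce a somewhat unusual notation for the coefficients of f(z);
namely, we set

$$f(z) = a_0 z^n + b_0 z^{n-1} + a_1 z^{n-2} + b_1 z^{n-3} + \cdots \qquad (a_0 \ne 0).$$

Then

$$f(i\omega) = U(\omega) + iV(\omega), \tag{8}$$

where for even n

$$\begin{aligned} U(\omega) &= (-1)^{\frac{n}{2}} \left( a_0 \omega^n - a_1 \omega^{n-2} + a_2 \omega^{n-4} - \cdots \right), \\ V(\omega) &= (-1)^{\frac{n}{2} - 1} \left( b_0 \omega^{n-1} - b_1 \omega^{n-3} + b_2 \omega^{n-5} - \cdots \right) \end{aligned} \tag{8'}$$

and for odd n

$$\begin{aligned} U(\omega) &= (-1)^{\frac{n-1}{2}} \left( b_0 \omega^{n-1} - b_1 \omega^{n-3} + b_2 \omega^{n-5} - \cdots \right), \\ V(\omega) &= (-1)^{\frac{n-1}{2}} \left( a_0 \omega^n - a_1 \omega^{n-2} + a_2 \omega^{n-4} - \cdots \right). \end{aligned} \tag{8''}$$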
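
As a quick check of (8’) and (8’’), the two lowest non-trivial degrees can be written out directly. These instances are added here for illustration only; they follow by substituting z = iω and using $i^2 = -1$.

```latex
% n = 2 (even): f(z) = a_0 z^2 + b_0 z + a_1
f(i\omega) = -a_0\omega^2 + a_1 + i\,b_0\omega
\;\Longrightarrow\;
U(\omega) = (-1)^{1}\,(a_0\omega^2 - a_1), \qquad V(\omega) = (-1)^{0}\, b_0\omega .

% n = 3 (odd): f(z) = a_0 z^3 + b_0 z^2 + a_1 z + b_1
f(i\omega) = -b_0\omega^2 + b_1 + i\,(-a_0\omega^3 + a_1\omega)
\;\Longrightarrow\;
U(\omega) = (-1)^{1}\,(b_0\omega^2 - b_1), \qquad V(\omega) = (-1)^{1}\,(a_0\omega^3 - a_1\omega).
```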


11 For if $f(z) = a_0 \prod_{i=1}^{n} (z - z_i)$, then $\Delta \arg f(z) = \sum_{i=1}^{n} \Delta \arg (z - z_i)$. If the point
$z_i$ lies inside the domain in question, then $\Delta \arg (z - z_i) = 2\pi$; if $z_i$ lies outside the
domain, then $\Delta \arg (z - z_i) = 0$. 
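Following Routh, we make use of the Cauchy index. Then¹²

$$\frac{1}{\pi}\, \Delta_{-\infty}^{+\infty} \arg f(i\omega) = \begin{cases} \phantom{-} I_{-\infty}^{+\infty} \dfrac{U(\omega)}{V(\omega)} & \text{for } \displaystyle\lim_{\omega \to \infty} \frac{U(\omega)}{V(\omega)} = 0, \\[2ex] - I_{-\infty}^{+\infty} \dfrac{V(\omega)}{U(\omega)} & \text{for } \displaystyle\lim_{\omega \to \infty} \frac{V(\omega)}{U(\omega)} = 0. \end{cases} \tag{9}$$

The equations (8’) and (8’’) show that for even n the lower formula in
(9) must be taken and for odd n, the upper. Then we easily obtain from
(7), (8’), (8’’), and (9) that for every n (even or odd)¹³

$$I_{-\infty}^{+\infty} \frac{b_0 \omega^{n-1} - b_1 \omega^{n-3} + \cdots}{a_0 \omega^n - a_1 \omega^{n-2} + \cdots} = n - 2k. \tag{10}$$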


2. In order to determine the index on the left-hand side of (10) we use
Sturm’s theorem (see the preceding section). We set

$$f_1(\omega) = a_0 \omega^n - a_1 \omega^{n-2} + \cdots, \qquad f_2(\omega) = b_0 \omega^{n-1} - b_1 \omega^{n-3} + \cdots \tag{11}$$

and, following Routh, construct a generalized Sturm chain (see p. 176)

$$f_1(\omega),\ f_2(\omega),\ f_3(\omega),\ \ldots,\ f_m(\omega) \tag{12}$$

by the Euclidean algorithm.

First we consider the regular case: m = n + 1. In this case the degree
of each function in (12) is one less than that of the preceding, and the last
function $f_m(\omega)$ is of degree zero.¹⁴

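From Euclid’s algorithm (see (6)) it follows that

$$f_3(\omega) = \frac{a_0}{b_0}\, \omega f_2(\omega) - f_1(\omega) = c_0 \omega^{n-2} - c_1 \omega^{n-4} + c_2 \omega^{n-6} - \cdots,$$

where

$$c_0 = a_1 - \frac{a_0}{b_0}\, b_1 = \frac{b_0 a_1 - a_0 b_1}{b_0}, \qquad c_1 = a_2 - \frac{a_0}{b_0}\, b_2 = \frac{b_0 a_2 - a_0 b_2}{b_0}, \quad \ldots \tag{13}$$

Similarly

$$f_4(\omega) = \frac{b_0}{c_0}\, \omega f_3(\omega) - f_2(\omega) = d_0 \omega^{n-3} - d_1 \omega^{n-5} + \cdots,$$

where

$$d_0 = b_1 - \frac{b_0}{c_0}\, c_1 = \frac{c_0 b_1 - b_0 c_1}{c_0}, \qquad d_1 = b_2 - \frac{b_0}{c_0}\, c_2 = \frac{c_0 b_2 - b_0 c_2}{c_0}, \quad \ldots \tag{13'}$$

The coefficients of the remaining polynomials $f_5(\omega), \ldots, f_{n+1}(\omega)$ are
similarly determined. 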


12 Since $\arg f(i\omega) = \operatorname{arccot} \dfrac{U(\omega)}{V(\omega)} = \arctan \dfrac{V(\omega)}{U(\omega)}$.

13 We recall that the formula (10) was derived under the assumption that f(z) has no
roots on the imaginary axis.
14 In the regular case (12) is the ordinary (not generalized) Sturm chain. 
Each polynomial

$$f_1(\omega),\ f_2(\omega),\ \ldots,\ f_{n+1}(\omega) \tag{14}$$

is an even or an odd function and two adjacent polynomials always have
opposite parity.
We form the Routh scheme

$$\begin{array}{llll} a_0, & a_1, & a_2, & \ldots, \\ b_0, & b_1, & b_2, & \ldots, \\ c_0, & c_1, & c_2, & \ldots, \\ d_0, & d_1, & d_2, & \ldots, \\ \ldots & & & \end{array} \tag{15}$$


The formulas (13), (13’) show that every row in this scheme is deter- 
mined by the two preceding rows according to the following rule: 


From the numbers of the upper row we subtract the corresponding num- 
bers of the lower row multiplied by the number that makes the first differ- 
ence zero. Omitting this zero difference, we obtain the required row. 
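
The rule just stated can be tried out mechanically. The following is a minimal sketch in Python, assuming exact rational arithmetic and the regular case only; the function name `routh_scheme` and the sample polynomial are illustrative choices, not part of the text.

```python
from fractions import Fraction

def routh_scheme(coeffs):
    """Rows of Routh's scheme for f(z) = coeffs[0]*z^n + coeffs[1]*z^(n-1) + ...

    Regular case only: a vanishing leading element raises ZeroDivisionError.
    """
    c = [Fraction(x) for x in coeffs]
    n = len(c) - 1
    rows = [c[0::2], c[1::2]]            # a_0, a_1, ... and b_0, b_1, ...
    while len(rows) < n + 1:
        upper, lower = rows[-2], rows[-1]
        factor = upper[0] / lower[0]     # the number that makes the first difference zero
        lower = lower + [Fraction(0)] * (len(upper) - len(lower))
        diff = [u - factor * v for u, v in zip(upper, lower)]
        rows.append(diff[1:])            # omit the zero difference
    return rows

# Illustrative polynomial: f(z) = z^4 + 2z^3 + 3z^2 + 4z + 5
for row in routh_scheme([1, 2, 3, 4, 5]):
    print([str(x) for x in row])
```

Each new row is obtained from the two preceding ones exactly as in (13) and (13’); the shorter row is padded with zeros so that the subtraction is defined for the last element.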

The regular case is obviously characterized by the fact that the repeated
application of this rule never yields a zero in the sequence

$$
b_0,\ c_0,\ d_0,\ \ldots
$$

Figs. 8 and 9 show the skeleton of Routh’s scheme for an even n (n = 6)
and an odd n (n = 7). Here the elements of the scheme are indicated by
dots.

In the regular case, the polynomials $f_1(\omega)$ and $f_2(\omega)$ have the greatest
common divisor $f_{n+1}(\omega) = \text{const} \neq 0$. Therefore these polynomials, and
hence $U(\omega)$ and $V(\omega)$ (see (8’), (8”), and (11)), do not vanish simultaneously;
i.e., $f(i\omega) = U(\omega) + iV(\omega) \neq 0$ for real $\omega$. Therefore: In the
regular case the formula (10) holds.

When we apply Sturm’s theorem in the interval $(-\infty, +\infty)$ to the left-hand
side of this formula and make use of (14), we obtain by (10):

$$
V(-\infty) - V(+\infty) = n - 2k. \tag{16}
$$

In our case¹⁵

$$
V(+\infty) = V(a_0, b_0, c_0, d_0, \ldots)
$$

and

$$
V(-\infty) = V(a_0, -b_0, c_0, -d_0, \ldots).
$$

[Fig. 7] [Fig. 8]

15 The sign of $f_k(\omega)$ for $\omega = +\infty$ coincides with the sign of the highest coefficient
and for $\omega = -\infty$ differs from it by the factor $(-1)^{n-k+1}$ $(k = 1, 2, \ldots, n+1)$.


Hence

$$
V(-\infty) = n - V(+\infty). \tag{17}
$$

From (16) and (17) we find:

$$
k = V(a_0, b_0, c_0, d_0, \ldots). \tag{18}
$$

Thus we have proved the following theorem:


THEOREM 2 (Routh): The number of roots of the real polynomial f(z)
in the right half-plane Re z > 0 is equal to the number of variations of sign
in the first column of Routh’s scheme.


3. We consider the important special case where all the roots of f(z) have 
negative real parts (‘case of stability’). If in this case we construct for the 
polynomials (11) the generalized Sturm chain (14), then, since k =0, the 
formula (16) can be written as follows: 


$$
V(-\infty) - V(+\infty) = n. \tag{19}
$$

But $0 \le V(-\infty) \le m - 1 \le n$ and $0 \le V(+\infty) \le m - 1 \le n$. Therefore
(19) is possible only when $m = n + 1$ (regular case!) and $V(+\infty) = 0$,
$V(-\infty) = m - 1 = n$. The formula (18) then implies:


ROUTH’S CRITERION. All the roots of the real polynomial f(z) have
negative real parts if and only if in the carrying out of Routh’s algorithm
all the elements of the first column of Routh’s scheme are different from
zero and of like sign.
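
Theorem 2 and Routh’s criterion translate directly into a short computation. The sketch below is an illustration under stated assumptions: it handles only the regular case, uses exact rationals, and the names `first_column`, `sign_variations`, `roots_in_right_half_plane` and `is_stable` are hypothetical helpers introduced here.

```python
from fractions import Fraction

def first_column(coeffs):
    """First column a_0, b_0, c_0, ... of Routh's scheme (regular case only)."""
    c = [Fraction(x) for x in coeffs]
    n = len(c) - 1
    rows = [c[0::2], c[1::2]]
    while len(rows) < n + 1:
        upper, lower = rows[-2], rows[-1]
        if lower[0] == 0:
            raise ZeroDivisionError("singular case: a leading element vanishes")
        factor = upper[0] / lower[0]
        lower = lower + [Fraction(0)] * (len(upper) - len(lower))
        rows.append([u - factor * v for u, v in zip(upper, lower)][1:])
    return [row[0] for row in rows]

def sign_variations(seq):
    """Number of sign changes in a sequence (zeros are skipped)."""
    signs = [x > 0 for x in seq if x != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

def roots_in_right_half_plane(coeffs):
    """k = V(a_0, b_0, c_0, ...), formula (18)."""
    return sign_variations(first_column(coeffs))

def is_stable(coeffs):
    """Routh's criterion: first column non-zero and of like sign."""
    try:
        col = first_column(coeffs)
    except ZeroDivisionError:
        return False                      # a zero in the first column
    return all(x > 0 for x in col) or all(x < 0 for x in col)

print(roots_in_right_half_plane([1, -6, 11, -6]))  # (z-1)(z-2)(z-3)
print(is_stable([1, 6, 11, 6]))                    # (z+1)(z+2)(z+3)
```

For $(z-1)(z-2)(z-3)$ the first column is $1, -6, 10, -6$, which has three variations of sign, in agreement with the three roots in the right half-plane.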


4. In deriving Routh’s theorem we have made use of the formula (10). 
In what follows we shall have to generalize this formula. The formula (10) 
was deduced under the assumption that f(z) has no roots on the imaginary 
axis. We shall now show that in the general case, where the polynomial 
$f(z) = a_0z^{n} + b_0z^{n-1} + a_1z^{n-2} + \cdots$ $(a_0 \neq 0)$ has k roots in the right half-plane
and s roots on the imaginary axis, the formula (10) is replaced by

$$
I_{-\infty}^{+\infty}\, \frac{b_0\omega^{n-1} - b_1\omega^{n-3} + b_2\omega^{n-5} - \cdots}{a_0\omega^{n} - a_1\omega^{n-2} + a_2\omega^{n-4} - \cdots} = n - 2k - s. \tag{20}
$$

For

$$
f(z) = d(z)\,f^*(z),
$$

where the real polynomial $d(z) = z^s + \cdots$ has s roots on the imaginary
axis and the polynomial $f^*(z)$ of degree $n^* = n - s$ has no such roots.

For the sake of definiteness, we consider the case where s is even (the
case where s is odd is analyzed similarly).

Let

$$
f(i\omega) = U(\omega) + iV(\omega) = d(i\omega)\,[\,U^*(\omega) + iV^*(\omega)\,].
$$

Since in our case $d(i\omega)$ is a real polynomial in $\omega$, we have

$$
\frac{U(\omega)}{V(\omega)} = \frac{U^*(\omega)}{V^*(\omega)}.
$$

Since $n$ and $n^*$ have equal parity, we find by using (8’), (8”), and the notation (11):

$$
\frac{f_2(\omega)}{f_1(\omega)} = \frac{f_2^*(\omega)}{f_1^*(\omega)}.
$$

We apply formula (10) to $f^*(z)$. Therefore

$$
I_{-\infty}^{+\infty}\, \frac{f_2(\omega)}{f_1(\omega)} = I_{-\infty}^{+\infty}\, \frac{f_2^*(\omega)}{f_1^*(\omega)} = n^* - 2k = n - 2k - s,
$$

and this is what we had to prove.


§ 4. The Singular Case. Examples 


1. In the preceding section we have examined the regular case where in
Routh’s scheme none of the numbers $b_0, c_0, d_0, \ldots$ vanish.

We now proceed to deal with the singular cases, where among the numbers
$b_0, c_0, \ldots$ there occurs a zero, say, $h_0 = 0$. Routh’s algorithm stops with
the row in which $h_0$ occurs, because to obtain the numbers of the following
row we would have to divide by $h_0$.

The singular cases can be of two types:


1) In the row in which $h_0$ occurs there are numbers different from zero.
This means that at some place of (12) the degree drops by more than one.


2) All the numbers of the row in which $h_0$ occurs vanish simultaneously.
Then this row is the (m + 1)-th, where m is the number of terms in the
generalized Sturm chain (12). In that case, the degrees of the functions in
(12) decrease by unity from one function to the next, but the degree of
the last function $f_m(\omega)$ is greater than zero. In both cases the number of
functions in (12) is $m < n + 1$.

Since the ordinary Routh’s algorithm comes to an end in both cases,
Routh gives a special rule for continuing the scheme in the cases 1), 2).

2. In case 1), according to Routh, we have to substitute for $h_0 = 0$ a ‘small’
value $\varepsilon$ of definite (but arbitrary) sign and continue to fill in the scheme.
Then the subsequent elements of the first column of the scheme are rational
functions of $\varepsilon$. The signs of these elements are determined by the ‘smallness’
and the sign of $\varepsilon$. If any one of these elements vanishes identically in $\varepsilon$,
then we replace this element by another small value $\eta$ and continue the
algorithm.
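Example:

$$
f(z) = z^4 + z^3 + 2z^2 + 2z + 1.
$$

Routh’s scheme (with a small parameter $\varepsilon$):

$$
\begin{array}{lll}
1, & 2, & 1 \\
1, & 2 & \\
\varepsilon, & 1 & \\
2 - \dfrac{1}{\varepsilon} & & \\
1 & &
\end{array}
\qquad k = V\!\left(1,\ 1,\ \varepsilon,\ 2 - \frac{1}{\varepsilon},\ 1\right) = 2.
$$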
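
The example can be checked numerically. The sketch below is an illustration only: it stands in for the symbolic parameter by a small rational number, so it assumes that the chosen value is ‘small enough’ for the signs to have settled; the function names are hypothetical.

```python
from fractions import Fraction

def first_column_eps(coeffs, eps):
    """First column of Routh's scheme, replacing a vanishing leading element
    of a row that is not identically zero by the small number eps."""
    c = [Fraction(x) for x in coeffs]
    n = len(c) - 1
    rows = [c[0::2], c[1::2]]
    while len(rows) < n + 1:
        upper, lower = rows[-2], rows[-1]
        if lower[0] == 0:
            if not any(lower):
                raise ValueError("zero row: singularity of the second type")
            lower = [Fraction(eps)] + lower[1:]   # Routh's first rule
            rows[-1] = lower
        factor = upper[0] / lower[0]
        padded = lower + [Fraction(0)] * (len(upper) - len(lower))
        rows.append([u - factor * v for u, v in zip(upper, padded)][1:])
    return [row[0] for row in rows]

def sign_variations(seq):
    signs = [x > 0 for x in seq if x != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

f = [1, 1, 2, 2, 1]                               # z^4 + z^3 + 2z^2 + 2z + 1
for eps in (Fraction(1, 1000), Fraction(-1, 1000)):
    col = first_column_eps(f, eps)
    print([str(x) for x in col], sign_variations(col))
```

Both signs of the small parameter give two variations of sign, so the count does not depend on the arbitrary choice of sign.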


This special method of varying the elements of the scheme is based on
the following observation:

Since we assume that there is no singularity of the second type, the
functions $f_1(\omega)$ and $f_2(\omega)$ are relatively prime. Hence it follows that the
polynomial f(z) has no roots on the imaginary axis.

In Routh’s scheme all the elements are expressed rationally in terms of
the elements of the first two rows, i.e., the coefficients of the given polynomial.
But it is not difficult to observe in the formulas (13), (13’) and
the analogous formulas for the subsequent rows that, once we have given
arbitrary values to the elements of any two adjacent rows of Routh’s scheme
and to the first element of the preceding row, we can express all the elements
in the first two rows, i.e., the coefficients of the original polynomial, in
integral rational form in terms of these elements. Thus, for example, all
the numbers a, b can be represented as integral rational functions of

$$
a_0,\ b_0,\ c_0,\ \ldots,\ g_0,\ g_1,\ g_2,\ \ldots,\ h_0,\ h_1,\ h_2,\ \ldots
$$


Therefore, in replacing $g_0 = 0$ by $\varepsilon$ we in fact modify our original polynomial.
Instead of the scheme for f(z) we have the Routh scheme for a
polynomial $F(z, \varepsilon)$, where $F(z, \varepsilon)$ is an integral rational function of $z$ and $\varepsilon$
which reduces to f(z) for $\varepsilon = 0$. Since the roots of $F(z, \varepsilon)$ change continuously
with a change of the parameter $\varepsilon$ and since there are no roots on the
imaginary axis for $\varepsilon = 0$, the number k of roots in the right half-plane is the
same for $F(z, \varepsilon)$ and $F(z, 0) = f(z)$ for values of $\varepsilon$ of small modulus.

3. Let us now proceed to a singularity of the second type. Suppose that
in Routh’s scheme

$$
a_0 \neq 0,\ b_0 \neq 0,\ \ldots,\ e_0 \neq 0,\quad g_0 = 0,\ g_1 = 0,\ g_2 = 0,\ \ldots
$$

In this case, the last polynomial in the generalized Sturm chain (12) is of
the form:

$$
f_m(\omega) = e_0\omega^{n-m+1} - e_1\omega^{n-m-1} + \cdots.
$$

Routh proposes to replace $f_{m+1}(\omega)$, which is zero, by $f_m'(\omega)$; i.e., he
proposes to write instead of $g_0, g_1, \ldots$ the corresponding coefficients

$$
(n-m+1)\,e_0,\quad (n-m-1)\,e_1,\quad \ldots
$$

and to continue the algorithm.


The logical basis for this rule is as follows:

By formula (20)

$$
I_{-\infty}^{+\infty}\, \frac{f_2(\omega)}{f_1(\omega)} = n - 2k - s
$$

(the s roots of f(z) on the imaginary axis coincide with the real roots of
$f_m(\omega)$). Therefore, if these real roots are simple, then (see p. 174)

$$
I_{-\infty}^{+\infty}\, \frac{f_m'(\omega)}{f_m(\omega)} = s
$$

and therefore

$$
I_{-\infty}^{+\infty}\, \frac{f_2(\omega)}{f_1(\omega)} + I_{-\infty}^{+\infty}\, \frac{f_m'(\omega)}{f_m(\omega)} = n - 2k.
$$

This formula shows that the missing part of Routh’s scheme must be filled
by the Routh scheme for the polynomials $f_m(\omega)$ and $f_m'(\omega)$. The coefficients
of $f_m'(\omega)$ are used to replace the elements of the zero row in Routh’s
scheme.

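But if the roots of $f_m(\omega)$ are not simple, then we denote by $d(\omega)$ the
greatest common divisor of $f_m(\omega)$ and $f_m'(\omega)$, by $e(\omega)$ the greatest common
divisor of $d(\omega)$ and $d'(\omega)$, etc., and we have:

$$
I_{-\infty}^{+\infty}\, \frac{f_m'(\omega)}{f_m(\omega)} + I_{-\infty}^{+\infty}\, \frac{d'(\omega)}{d(\omega)} + I_{-\infty}^{+\infty}\, \frac{e'(\omega)}{e(\omega)} + \cdots = s.
$$

Thus the required number k can be found if the missing part of Routh’s
scheme is filled by the Routh scheme for $f_m(\omega)$ and $f_m'(\omega)$, then the scheme
for $d(\omega)$ and $d'(\omega)$, then that for $e(\omega)$ and $e'(\omega)$, etc., i.e., Routh’s rule
has to be applied several times to dispose of a singularity of the second type.

Example. $f(z) = z^{10} + z^9 - z^8 - 2z^7 + z^6 + 3z^5 + z^4 - 2z^3 - z^2 + z + 1$.

Scheme

$$
\begin{array}{l|rrrrrr}
\omega^{10} & 1 & -1 & 1 & 1 & -1 & 1 \\
\omega^{9} & 1 & -2 & 3 & -2 & 1 & \\
\omega^{8} & 1 & -2 & 3 & -2 & 1 & \\
\omega^{7} & 8 & -12 & 12 & -4 & & \\
 & 2 & -3 & 3 & -1 & & \\
\omega^{6} & -1 & 3 & -3 & 2 & & \\
\omega^{5} & 3 & -3 & 3 & & & \\
 & 1 & -1 & 1 & & & \\
\omega^{4} & 2 & -2 & 2 & & & \\
 & 1 & -1 & 1 & & & \\
\omega^{3} & 4 & -2 & & & & \\
 & 2 & -1 & & & & \\
\omega^{2} & -1 & 2 & & & & \\
\omega & 1 & & & & & \\
\omega^{0} & 2 & & & & & \\
 & 1 & & & & &
\end{array}
\qquad k = V(1, 1, 1, 2, -1, 1, 1, 2, -1, 1, 1) = 4.
$$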


Note. All the elements of any one row may be multiplied by one and
the same positive number without changing the signs of the elements of the first
column. This remark has been used in constructing the scheme.
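
Routh’s rule for a zero row can be sketched in a few lines. The code below is an illustration under stated assumptions: exact rational arithmetic, no singularity of the first type, and hypothetical function names. A row that vanishes identically is replaced by the coefficients of the derivative of the polynomial belonging to the preceding row, as described above; the rows are not rescaled, which does not affect the signs.

```python
from fractions import Fraction

def first_column_zero_row_rule(coeffs):
    """First column of Routh's scheme, continuing past zero rows by
    Routh's second rule (derivative of the preceding row's polynomial)."""
    c = [Fraction(x) for x in coeffs]
    n = len(c) - 1
    rows = [c[0::2], c[1::2]]
    while True:
        if not any(rows[-1]):
            # rows[-2] represents e_0 w^d - e_1 w^(d-2) + ...; its derivative
            # has the scheme coefficients d*e_0, (d-2)*e_1, ...
            d = n - (len(rows) - 2)
            rows[-1] = [(d - 2 * j) * e for j, e in enumerate(rows[-2]) if d - 2 * j > 0]
        if len(rows) == n + 1:
            break
        upper, lower = rows[-2], rows[-1]
        if lower[0] == 0:
            raise ValueError("singularity of the first type")
        factor = upper[0] / lower[0]
        padded = lower + [Fraction(0)] * (len(upper) - len(lower))
        rows.append([u - factor * v for u, v in zip(upper, padded)][1:])
    return [row[0] for row in rows]

def sign_variations(seq):
    signs = [x > 0 for x in seq if x != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

# The polynomial of the example, coefficients from z^10 down to z^0
f = [1, 1, -1, -2, 1, 3, 1, -2, -1, 1, 1]
col = first_column_zero_row_rule(f)
print([str(x) for x in col], sign_variations(col))
```

The unscaled first column differs from the one in the scheme above only by positive factors, so the number of variations of sign is the same.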


4. However, the application of both rules of Routh does not enable us to
determine the number k in all the cases. The application of the first rule
(introduction of small parameters $\varepsilon, \ldots$) is justified only when f(z) has
no roots on the imaginary axis.

If f(z) has roots on the imaginary axis, then by varying the parameter $\varepsilon$
some of these roots may pass over into the right half-plane and change k.

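Example. $f(z) = z^6 + z^5 + 3z^4 + 3z^3 + 3z^2 + 2z + 1$.

Scheme

$$
\begin{array}{l|llll}
\omega^{6} & 1 & 3 & 3 & 1 \\
\omega^{5} & 1 & 3 & 2 & \\
\omega^{4} & \varepsilon & 1 & 1 & \\
\omega^{3} & 3 - \dfrac{1}{\varepsilon} & 2 - \dfrac{1}{\varepsilon} & & \\[2ex]
\omega^{2} & 1 - \dfrac{2\varepsilon - 1}{3 - \frac{1}{\varepsilon}} & 1 & & \\[2ex]
\omega & 2 - \dfrac{1}{\varepsilon} - \dfrac{3 - \frac{1}{\varepsilon}}{1 - \dfrac{2\varepsilon - 1}{3 - \frac{1}{\varepsilon}}} = -\varepsilon + \cdots & & & \\[2ex]
\omega^{0} & 1 & & &
\end{array}
$$

$$
V\!\left(1,\ 1,\ \varepsilon,\ 3 - \frac{1}{\varepsilon},\ 1,\ -\varepsilon,\ 1\right) =
\begin{cases}
4 & \text{for } \varepsilon > 0, \\
2 & \text{for } \varepsilon < 0.
\end{cases}
$$

The question of the value of k remains open.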
In the general case, where f(z) has roots on the imaginary axis, we have 
to proceed as follows: 


Setting $f(z) = F_1(z) + F_2(z)$, where

$$
F_1(z) = a_0z^{n} + a_1z^{n-2} + \cdots, \qquad F_2(z) = b_0z^{n-1} + b_1z^{n-3} + \cdots,
$$

we must find the greatest common divisor $d(z)$ of $F_1(z)$ and $F_2(z)$. Then
$f(z) = d(z)f^*(z)$.

If f(z) has a root z for which −z is also a root (all the roots on the
imaginary axis have this property), then it follows from $f(z) = 0$ and
$f(-z) = 0$ that $F_1(z) = 0$ and $F_2(z) = 0$, i.e., z is a root of $d(z)$. Therefore
$f^*(z)$ has no roots z for which −z is also a root of $f^*(z)$.

Then

$$
k = k_1 + k_2,
$$

where $k_1$ and $k_2$ are the respective numbers of roots of $f^*(z)$ and $d(z)$ in the
right half-plane; $k_1$ is determined by Routh’s algorithm and $k_2 = (q - s)/2$,
where q is the degree of $d(z)$ and s the number of real roots of $d(i\omega)$.¹⁶

In the last example,

$$
d(z) = z^2 + 1, \qquad f^*(z) = z^4 + z^3 + 2z^2 + 2z + 1.
$$

Therefore (see example on p. 182), we have $k_2 = 0$, $k_1 = 2$, and hence

$$
k = 2.
$$
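
The ambiguity of the first rule for this polynomial, and its resolution by splitting off $d(z)$, can be reproduced numerically. The following sketch is illustrative only: a small rational number stands in for the parameter, the factor $d(z) = z^2 + 1$ is taken from the text rather than computed, and all function names are hypothetical.

```python
from fractions import Fraction

def first_column_eps(coeffs, eps):
    """First column of Routh's scheme with a vanishing leading element
    (row not identically zero) replaced by the small number eps."""
    c = [Fraction(x) for x in coeffs]
    n = len(c) - 1
    rows = [c[0::2], c[1::2]]
    while len(rows) < n + 1:
        upper, lower = rows[-2], rows[-1]
        if lower[0] == 0:
            if not any(lower):
                raise ValueError("zero row: singularity of the second type")
            lower = [Fraction(eps)] + lower[1:]
            rows[-1] = lower
        factor = upper[0] / lower[0]
        padded = lower + [Fraction(0)] * (len(upper) - len(lower))
        rows.append([u - factor * v for u, v in zip(upper, padded)][1:])
    return [row[0] for row in rows]

def sign_variations(seq):
    signs = [x > 0 for x in seq if x != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

def poly_mul(p, q):
    """Product of two polynomials given by coefficient lists (highest first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

f = [1, 1, 3, 3, 3, 2, 1]          # z^6 + z^5 + 3z^4 + 3z^3 + 3z^2 + 2z + 1
d = [1, 0, 1]                      # d(z) = z^2 + 1, roots +i and -i
f_star = [1, 1, 2, 2, 1]           # f*(z) = z^4 + z^3 + 2z^2 + 2z + 1

eps = Fraction(1, 1000)
count_f = [sign_variations(first_column_eps(f, e)) for e in (eps, -eps)]
count_f_star = [sign_variations(first_column_eps(f_star, e)) for e in (eps, -eps)]
print(poly_mul(d, f_star) == f, count_f, count_f_star)
```

For f itself the two signs of the parameter give different counts, whereas for $f^*$ they agree; together with $k_2 = 0$ this yields $k = 2$.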


§ 5. Lyapunov’s Theorem 


1. From the investigations of A. M. Lyapunov published in 1892 in his 
monograph ‘The General Problem of Stability of Motion’ there follows a 
theorem¹⁷ that gives necessary and sufficient conditions for all the roots of
the characteristic equation $|\lambda E - A| = 0$ of a real matrix $A = \|a_{ik}\|_1^n$ to
have negative real parts. Since every polynomial

$$
f(\lambda) = a_0\lambda^{n} + a_1\lambda^{n-1} + \cdots + a_n \qquad (a_0 \neq 0)
$$

can be represented as a characteristic determinant $|\lambda E - A|$,¹⁸ Lyapunov’s
theorem is of general character and is applicable to an arbitrary algebraic
equation $f(\lambda) = 0$.

16 $d(i\omega)$ is a real polynomial or becomes one after cancelling $i$. The number of its real
roots can be determined by Sturm’s theorem.

17 See [32], § 20.


Suppose given a real matrix $A = \|a_{ik}\|_1^n$ and a homogeneous polynomial
of dimension m in the variables $x_1, x_2, \ldots, x_n$:

$$
V(x, x, \ldots, x) \qquad (x = (x_1, x_2, \ldots, x_n)).
$$

Let us find the total derivative with respect to t of $V(x, x, \ldots, x)$ under
the assumption that x is a solution of the differential system

$$
\frac{dx}{dt} = Ax.
$$

Then

$$
\frac{d}{dt}V(x, x, \ldots, x) = V(Ax, x, \ldots, x) + V(x, Ax, \ldots, x) + \cdots + V(x, x, \ldots, Ax) = W(x, x, \ldots, x), \tag{21}
$$

where $W(x, x, \ldots, x)$ is again a homogeneous polynomial of dimension m
in $x_1, x_2, \ldots, x_n$. The equation (21) defines a linear operator $\hat{A}$ which associates
with every homogeneous polynomial of dimension m, $V(x, x, \ldots, x)$,
a certain homogeneous polynomial $W(x, x, \ldots, x)$ of the same dimension m:

$$
W = \hat{A}(V).
$$

We restrict ourselves to the case m = 2.¹⁹ Then $V(x, x)$ and $W(x, x)$
are quadratic forms in the variables $x_1, x_2, \ldots, x_n$ connected by the equation

$$
\frac{d}{dt}V(x, x) = V(Ax, x) + V(x, Ax) = W(x, x); \tag{22}
$$

hence²⁰

$$
W = \hat{A}(V) = A'V + VA. \tag{23}
$$
18 For this purpose it is sufficient to set, for example:

$$
A = \begin{Vmatrix}
0 & 0 & \cdots & 0 & -\dfrac{a_n}{a_0} \\
1 & 0 & \cdots & 0 & -\dfrac{a_{n-1}}{a_0} \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -\dfrac{a_1}{a_0}
\end{Vmatrix}.
$$

19 A. M. Lyapunov has proved his theorem for every positive integer m.
20 Because $V(x, y) = x'Vy$.

Here $V = \|v_{ik}\|_1^n$ and $W = \|w_{ik}\|_1^n$ are symmetric matrices formed,
respectively, from the coefficients of the forms $V(x, x)$ and $W(x, x)$. The
linear operator $\hat{A}$ in the space of matrices of order n is completely determined
by specification of the matrix $A = \|a_{ik}\|_1^n$.

If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the characteristic values of the matrix A, then every
characteristic value of the operator $\hat{A}$ can be represented in the form
$\lambda_i + \lambda_k$ $(1 \le i, k \le n)$.²¹

Therefore, if the matrix $A = \|a_{ik}\|_1^n$ has no zero characteristic value and
no two that are opposites, then the operator $\hat{A}$ is non-singular. In this case
the matrix W in (23) determines the matrix V uniquely.

If V is symmetric, then the matrix W defined by (23) is also symmetric.
If $\hat{A}$ is a non-singular operator, then the converse statement also holds:
Every symmetric matrix W corresponds by (23) to a symmetric matrix V.
For in this case we find, by going over to the transposed matrices on both
sides of (23), that the matrix $V'$, as well as V, satisfies (23). By the uniqueness
of the solution, $V' = V$.

Thus: If the matrix $A = \|a_{ik}\|_1^n$ has no zero and no two opposite characteristic
values, then every quadratic form $W(x, x)$ corresponds to one and
only one quadratic form $V(x, x)$ connected with $W(x, x)$ by (22).

Now we can formulate Lyapunov’s theorem. 


THEOREM 3 (Lyapunov): If all the characteristic values of the real
matrix $A = \|a_{ik}\|_1^n$ have negative real parts, then to every negative-definite
quadratic form $W(x, x)$ there corresponds a positive-definite quadratic form
$V(x, x)$ connected with $W(x, x)$, taking

$$
\frac{dx}{dt} = Ax \tag{24}
$$

into account, by the equation

$$
\frac{d}{dt}V(x, x) = W(x, x). \tag{25}
$$

Conversely, if for every negative-definite form $W(x, x)$ there exists a
positive-definite form $V(x, x)$ connected with $W(x, x)$ by the equation (25),
taking (24) into account, then all the characteristic values of the matrix
$A = \|a_{ik}\|_1^n$ have negative real parts.

Proof. 1. Suppose that all the characteristic values of A have negative
real parts. Then for every solution $x = e^{At}x_0$ of (24) we have $\lim_{t\to+\infty} x = 0$.²²
Suppose that the forms $V(x, x)$ and $W(x, x)$ are connected by (25) and that

$$
W(x, x) < 0 \qquad (x \neq 0).^{23}
$$

21 See footnote 18.
22 See Vol. I, Chapter V, § 6.

Let us assume that for some $x_0 \neq 0$

$$
V_0 = V(x_0, x_0) \le 0.
$$

But $\frac{d}{dt}V(x, x) = W(x, x) < 0$ $(x = e^{At}x_0)$. Therefore for $t > 0$ the value
of $V(x, x)$ is negative and decreases for $t \to \infty$, which results in a contradiction
to the equation $\lim_{t\to+\infty} V(x, x) = \lim_{x\to 0} V(x, x) = 0$. Therefore $V(x, x) > 0$
for $x \neq 0$, i.e., $V(x, x)$ is a positive-definite quadratic form.

2. Suppose, conversely, that in (25)

$$
W(x, x) < 0, \quad V(x, x) > 0 \qquad (x \neq 0).
$$

From (25) it follows that

$$
V(x, x) = V(x_0, x_0) + \int_0^t W(x, x)\,dt \qquad (x = e^{At}x_0). \tag{25’}
$$

We shall show that for every $x_0 \neq 0$ the column $x = e^{At}x_0$ comes arbitrarily
near to zero for arbitrarily large values of $t > 0$. Assume the contrary.
Then there exists a number $\nu > 0$ such that

$$
W(x, x) < -\nu < 0 \qquad (x = e^{At}x_0,\ x_0 \neq 0,\ t > 0).
$$

But then from (25’)

$$
V(x, x) < V(x_0, x_0) - \nu t,
$$

and so for sufficiently large values of t we have $V(x, x) < 0$, which contradicts
our assumption.

From what we have shown, it follows that for certain sufficiently large
values of t the value of $V(x, x)$ $(x = e^{At}x_0,\ x_0 \neq 0)$ will be arbitrarily near
to zero. But $V(x, x)$ decreases monotonically for $t > 0$, since $\frac{d}{dt}V(x, x) = W(x, x) < 0$.
Therefore $\lim_{t\to+\infty} V(x, x) = 0$.

Hence it follows that for every $x_0 \neq 0$, $\lim_{t\to+\infty} e^{At}x_0 = 0$, i.e., $\lim_{t\to+\infty} e^{At} = 0$.
This is only possible if all the characteristic values of A have negative real
parts (see Vol. I, Chapter V, § 6).


The theorem is now completely proved.
For the form $W(x, x)$ in Lyapunov’s theorem we can take any negative-definite
form, in particular, the form $-\sum_{i=1}^{n} x_i^2$. In this case the theorem
admits of the following matrix formulation:


23 The form $W(x, x)$ is given arbitrarily. The form $V(x, x)$ is uniquely determined
by (25), because A has in this case neither the characteristic value zero nor pairs of
opposite characteristic values.

THEOREM 3’: All the characteristic values of the real matrix $A = \|a_{ik}\|_1^n$
have negative real parts if and only if the matrix equation

$$
A'V + VA = -E \tag{26}
$$

has as its solution V the coefficient matrix of some positive-definite quadratic
form $V(x, x) > 0$.
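
Theorem 3’ suggests a direct computational test. The sketch below is a minimal illustration in exact rational arithmetic: it writes (26) as a linear system in the $n^2$ unknowns $v_{ik}$, solves it by Gauss-Jordan elimination, and tests positive-definiteness by the successive principal minors. The function names and the two sample matrices are assumptions made for the illustration, and the solver presumes that the system is non-singular.

```python
from fractions import Fraction

def solve_linear(M, rhs):
    """Gauss-Jordan elimination over the rationals (M assumed non-singular)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def lyapunov_solve(A):
    """Solve A'V + VA = -E for V; unknowns v_ij are flattened row by row."""
    n = len(A)
    idx = lambda i, j: i * n + j
    M = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    rhs = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                M[idx(i, j)][idx(k, j)] += A[k][i]   # (A'V)_ij = sum_k a_ki v_kj
                M[idx(i, j)][idx(i, k)] += A[k][j]   # (VA)_ij  = sum_k v_ik a_kj
            rhs.append(-1 if i == j else 0)
    v = solve_linear(M, rhs)
    return [[v[idx(i, j)] for j in range(n)] for i in range(n)]

def det(M):
    """Determinant by Laplace expansion (adequate for small orders)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def is_positive_definite(V):
    return all(det([r[:p] for r in V[:p]]) > 0 for p in range(1, len(V) + 1))

A_stable = [[0, 1], [-2, -3]]      # characteristic values -1, -2
A_unstable = [[0, 1], [2, 1]]      # characteristic values 2, -1
for A in (A_stable, A_unstable):
    V = lyapunov_solve(A)
    print([[str(x) for x in row] for row in V], is_positive_definite(V))
```

For the first matrix the solution is positive-definite, for the second it is not, in agreement with the location of the characteristic values.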


2. From this theorem we derive a criterion for determining the stability of
a non-linear system from its linear approximation.²⁴

Suppose that it is required to prove the asymptotic stability of the zero
solution of the non-linear system of differential equations (1) (p. 172) in
the case where the coefficients $a_{ik}$ $(i, k = 1, 2, \ldots, n)$ in the linear terms
on the right-hand side form a matrix $A = \|a_{ik}\|_1^n$ having only characteristic
values with negative real parts. Then, if we determine a positive-definite
form $V(x, x)$ by the matrix equation (26) and calculate its total derivative
with respect to time under the assumption that $x = (x_1, x_2, \ldots, x_n)$ is a
solution of the given system (1), we have:

$$
\frac{d}{dt}V(x, x) = -\sum_{i=1}^{n} x_i^2 + R(x_1, x_2, \ldots, x_n),
$$

where $R(x_1, x_2, \ldots, x_n)$ is a series containing terms of the third and higher
total degree in $x_1, x_2, \ldots, x_n$. Therefore, in some sufficiently small neighborhood
of (0, 0, ..., 0) we have simultaneously for every $x \neq 0$

$$
V(x, x) > 0, \qquad \frac{d}{dt}V(x, x) < 0.
$$

By Lyapunov’s general criterion of stability²⁵ this also indicates the
asymptotic stability of the zero solution of the system of differential equations.

If we express the elements of V from the matrix equation (26) in terms
of the elements of A and substitute these expressions in the inequalities

$$
v_{11} > 0, \quad
\begin{vmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{vmatrix} > 0, \quad \ldots, \quad
\begin{vmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & & \vdots \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{vmatrix} > 0,
$$

then we obtain the inequalities that the elements of a matrix $A = \|a_{ik}\|_1^n$
must satisfy in order that all the characteristic values of the matrix should
have negative real parts. However, these inequalities can be obtained in a
considerably simpler form from the criterion of Routh-Hurwitz, which will
be discussed in the following section.

24 See [32], § 26; [9], pp. 113 ff.; [36], pp. 66 ff.
25 See [32], § 16; [9], pp. 19-21 and 31-33; [36], pp. 32-34.

Note. Lyapunov’s theorem (3) or (3’) can be generalized immediately
to the case of an arbitrary complex matrix $A = \|a_{ik}\|_1^n$. The quadratic
forms $V(x, x)$ and $W(x, x)$ are then replaced by Hermitian forms

$$
V(x, x) = \sum_{i,k=1}^{n} v_{ik}\,\bar{x}_i x_k, \qquad W(x, x) = \sum_{i,k=1}^{n} w_{ik}\,\bar{x}_i x_k.
$$

Correspondingly, the matrix equation (26) is replaced by the equation

$$
A^*V + VA = -E \qquad (A^* = \bar{A}').
$$


§ 6. The Theorem of Routh-Hurwitz 


1. In the preceding sections we have explained the method of Routh, unsurpassed
in its simplicity, of determining the number k of roots in the right
half-plane of a real polynomial whose coefficients are given as explicit
numbers. If the coefficients of the polynomial depend on parameters and
it is required to determine for what values of the parameters the number k
has one value or another (in particular, the value 0, the ‘domain of stability’),²⁶
then it is desirable to have explicit expressions for the values of $c_0, d_0, \ldots$
in terms of the coefficients of the given polynomial. In solving this problem,
we obtain a method of determining k and, in particular, a stability criterion
in the form in which it was established by Hurwitz [204].
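We again consider the polynomial

$$
f(z) = a_0z^{n} + b_0z^{n-1} + a_1z^{n-2} + b_1z^{n-3} + \cdots \qquad (a_0 \neq 0).
$$

By the Hurwitz matrix we mean the square matrix of order n

$$
H = \begin{Vmatrix}
b_0 & b_1 & b_2 & \cdots & b_{n-1} \\
a_0 & a_1 & a_2 & \cdots & a_{n-1} \\
0 & b_0 & b_1 & \cdots & b_{n-2} \\
0 & a_0 & a_1 & \cdots & a_{n-2} \\
0 & 0 & b_0 & \cdots & b_{n-3} \\
\vdots & \vdots & \vdots & & \vdots
\end{Vmatrix}
\qquad \left( a_k = 0 \text{ for } k > \left[\tfrac{n}{2}\right], \quad b_k = 0 \text{ for } k > \left[\tfrac{n-1}{2}\right] \right). \tag{27}
$$

26 For this is precisely the situation in planning new mechanical or electrical systems
of governors.

We transform the matrix by subtracting from the second, fourth, ...
rows the first, third, ... row, multiplied by $a_0/b_0$.²⁷ We obtain the matrix

$$
\begin{Vmatrix}
b_0 & b_1 & b_2 & \cdots & b_{n-1} \\
0 & c_0 & c_1 & \cdots & c_{n-2} \\
0 & b_0 & b_1 & \cdots & b_{n-2} \\
0 & 0 & c_0 & \cdots & c_{n-3} \\
0 & 0 & b_0 & \cdots & b_{n-3} \\
\vdots & \vdots & \vdots & & \vdots
\end{Vmatrix}.
$$

In this matrix $c_0, c_1, \ldots$ is the third row of Routh’s scheme supplemented by
zeros ($c_k = 0$ for $k > [n/2] - 1$).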
We transform this matrix again by subtracting from the third, fifth, ...
rows the second, fourth, ... row, multiplied by $b_0/c_0$:

$$
\begin{Vmatrix}
b_0 & b_1 & b_2 & b_3 & \cdots \\
0 & c_0 & c_1 & c_2 & \cdots \\
0 & 0 & d_0 & d_1 & \cdots \\
0 & 0 & c_0 & c_1 & \cdots \\
0 & 0 & 0 & d_0 & \cdots \\
0 & 0 & 0 & c_0 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{Vmatrix}.
$$

Continuing this process, we ultimately arrive at a triangular matrix of
order n

$$
R = \begin{Vmatrix}
b_0 & b_1 & b_2 & \cdots \\
0 & c_0 & c_1 & \cdots \\
0 & 0 & d_0 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{Vmatrix}, \tag{28}
$$

which we call the Routh matrix. It is obtained from Routh’s scheme (see
(15)) by: 1) deleting the first row; 2) shifting the rows to the right so that
their first elements come to lie on the main diagonal; and 3) completing it
by zeros to a square matrix of order n.


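27 We begin by dealing with the regular case where $b_0 \neq 0$, $c_0 \neq 0$, $d_0 \neq 0$, ....

DEFINITION 2: Two matrices $A = \|a_{ik}\|_1^n$ and $B = \|b_{ik}\|_1^n$ will be called
equivalent if and only if for every $p \le n$ the corresponding minors of order
p in the first p rows are equal:

$$
A\begin{pmatrix} 1 & 2 & \cdots & p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix}
= B\begin{pmatrix} 1 & 2 & \cdots & p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix}
\qquad \begin{pmatrix} i_1, i_2, \ldots, i_p = 1, 2, \ldots, n \\ p = 1, 2, \ldots, n \end{pmatrix}.
$$

Since we do not change the values of the minors of order p in the first
p rows when we subtract from any row of the matrix an arbitrary multiple
of any preceding row, the Hurwitz and Routh matrices H and R are equivalent
in the sense of Definition 2:

$$
H\begin{pmatrix} 1 & 2 & \cdots & p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix}
= R\begin{pmatrix} 1 & 2 & \cdots & p \\ i_1 & i_2 & \cdots & i_p \end{pmatrix}
\qquad \begin{pmatrix} i_1, i_2, \ldots, i_p = 1, 2, \ldots, n \\ p = 1, 2, \ldots, n \end{pmatrix}. \tag{29}
$$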


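The equivalence of the matrices H and R enables us to express all the
elements of R, i.e. of the Routh scheme, in terms of the minors of the Hurwitz
matrix H and, therefore, in terms of the coefficients of the given polynomial.
For when we give to p in (29) the values 1, 2, 3, ... in succession, we obtain

$$
\begin{aligned}
&H\begin{pmatrix} 1 \\ 1 \end{pmatrix} = b_0, &&H\begin{pmatrix} 1 \\ 2 \end{pmatrix} = b_1, &&H\begin{pmatrix} 1 \\ 3 \end{pmatrix} = b_2, &&\ldots, \\
&H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} = b_0c_0, &&H\begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix} = b_0c_1, &&H\begin{pmatrix} 1 & 2 \\ 1 & 4 \end{pmatrix} = b_0c_2, &&\ldots, \\
&H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} = b_0c_0d_0, &&H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix} = b_0c_0d_1, &&H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 5 \end{pmatrix} = b_0c_0d_2, &&\ldots,
\end{aligned}
\tag{30}
$$

etc.

Hence we find the following expressions for the elements of Routh’s
scheme:

$$
\begin{aligned}
&b_0 = H\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad b_1 = H\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad b_2 = H\begin{pmatrix} 1 \\ 3 \end{pmatrix}, \ \ldots, \\[1ex]
&c_0 = \frac{H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}}{H\begin{pmatrix} 1 \\ 1 \end{pmatrix}}, \quad
c_1 = \frac{H\begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix}}{H\begin{pmatrix} 1 \\ 1 \end{pmatrix}}, \quad
c_2 = \frac{H\begin{pmatrix} 1 & 2 \\ 1 & 4 \end{pmatrix}}{H\begin{pmatrix} 1 \\ 1 \end{pmatrix}}, \ \ldots, \\[1ex]
&d_0 = \frac{H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}}{H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}}, \quad
d_1 = \frac{H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 4 \end{pmatrix}}{H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}}, \quad
d_2 = \frac{H\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 5 \end{pmatrix}}{H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}}, \ \ldots
\end{aligned}
\tag{31}
$$

The successive principal minors of H are usually called the Hurwitz
determinants. We shall denote them by

$$
\Delta_1 = H\begin{pmatrix} 1 \\ 1 \end{pmatrix} = b_0, \quad
\Delta_2 = H\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} = \begin{vmatrix} b_0 & b_1 \\ a_0 & a_1 \end{vmatrix}, \ \ldots, \quad
\Delta_n = H\begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix} =
\begin{vmatrix} b_0 & b_1 & \cdots & b_{n-1} \\ a_0 & a_1 & \cdots & a_{n-1} \\ 0 & b_0 & \cdots & b_{n-2} \\ \vdots & \vdots & & \vdots \end{vmatrix}. \tag{32}
$$

Note 1. By the formulas (30),²⁸

$$
\Delta_1 = b_0, \quad \Delta_2 = b_0c_0, \quad \Delta_3 = b_0c_0d_0, \ \ldots \tag{33}
$$

From $\Delta_1 \neq 0, \ldots, \Delta_p \neq 0$ it follows that the first p of the numbers
$b_0, c_0, \ldots$ are different from zero, and vice versa; in this case the p successive
rows of Routh’s scheme beginning with the third are completely determined
and the formulas (31) hold for them.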
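
The relations (33) between the Hurwitz determinants and the first column of Routh’s scheme can be verified on a numerical example. The sketch below is illustrative: it assumes the regular case and exact rational arithmetic, and the names `hurwitz_matrix`, `det`, `hurwitz_determinants` and `routh_first_column` are hypothetical helpers.

```python
from fractions import Fraction

def hurwitz_matrix(coeffs):
    """Hurwitz matrix (27) for f(z) = coeffs[0] z^n + coeffs[1] z^(n-1) + ..."""
    n = len(coeffs) - 1
    a = [Fraction(x) for x in coeffs[0::2]]          # a_0, a_1, ...
    b = [Fraction(x) for x in coeffs[1::2]]          # b_0, b_1, ...
    get = lambda seq, k: seq[k] if 0 <= k < len(seq) else Fraction(0)
    # rows alternate b, a, b, a, ...; each pair is shifted one place to the right
    return [[get(b if i % 2 == 0 else a, j - i // 2) for j in range(n)]
            for i in range(n)]

def det(M):
    """Determinant by elimination over the rationals."""
    M = [row[:] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return d

def hurwitz_determinants(coeffs):
    H = hurwitz_matrix(coeffs)
    return [det([row[:p] for row in H[:p]]) for p in range(1, len(H) + 1)]

def routh_first_column(coeffs):
    """a_0, b_0, c_0, ... (regular case only)."""
    c = [Fraction(x) for x in coeffs]
    rows = [c[0::2], c[1::2]]
    while len(rows) < len(c):
        upper, lower = rows[-2], rows[-1]
        factor = upper[0] / lower[0]
        lower = lower + [Fraction(0)] * (len(upper) - len(lower))
        rows.append([u - factor * v for u, v in zip(upper, lower)][1:])
    return [row[0] for row in rows]

f = [1, 2, 3, 4, 5]                      # z^4 + 2z^3 + 3z^2 + 4z + 5
deltas = hurwitz_determinants(f)
products, running = [], Fraction(1)
for x in routh_first_column(f)[1:]:      # b_0, c_0, d_0, ...
    running *= x
    products.append(running)
print([str(x) for x in deltas], deltas == products)
```

Here the running products $b_0,\ b_0c_0,\ b_0c_0d_0,\ \ldots$ reproduce the successive principal minors of H.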

Note 2. The regular case (all the $b_0, c_0, \ldots$ have a meaning and are
different from zero) is characterized by the inequalities

$$
\Delta_1 \neq 0, \quad \Delta_2 \neq 0, \quad \ldots, \quad \Delta_n \neq 0.
$$

Note 3. The definition of the elements of Routh’s scheme by means of
the formulas (31) is more general than that by means of Routh’s algorithm.
Thus, for example, if $b_0 = H\begin{pmatrix} 1 \\ 1 \end{pmatrix} = 0$, then Routh’s algorithm does not give us
anything except the first two rows formed from the coefficients of the given
polynomial. However if for $\Delta_1 = 0$ the remaining determinants $\Delta_2, \Delta_3, \ldots$
are different from zero, then by omitting the row of c’s we can determine
by means of the formulas (31) all the remaining rows of Routh’s scheme.

By the formulas (33),

$$
b_0 = \Delta_1, \quad c_0 = \frac{\Delta_2}{\Delta_1}, \quad d_0 = \frac{\Delta_3}{\Delta_2}, \ \ldots
$$

and therefore

$$
V(a_0, b_0, c_0, \ldots) = V\!\left(a_0,\ \Delta_1,\ \frac{\Delta_2}{\Delta_1},\ \ldots,\ \frac{\Delta_n}{\Delta_{n-1}}\right) = V(a_0, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots).
$$

28 If the coefficients of f(z) are given numerically, then the formulas (33), reducing
this computation, as they do, to the formation of the Routh scheme, give by far the
simplest method for computing the Hurwitz determinants.

Hence Routh’s theorem can be restated as follows:


THEOREM 4 (Routh-Hurwitz): The number of roots of the real polynomial
$f(z) = a_0z^n + \cdots$ in the right half-plane is determined by the formula

$$
k = V\!\left(a_0,\ \Delta_1,\ \frac{\Delta_2}{\Delta_1},\ \ldots,\ \frac{\Delta_n}{\Delta_{n-1}}\right) \tag{34}
$$

or (what is the same) by

$$
k = V(a_0, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots). \tag{34’}
$$

Note. This statement of the Routh-Hurwitz theorem assumes that we
have the regular case

$$
\Delta_1 \neq 0, \quad \Delta_2 \neq 0, \quad \ldots, \quad \Delta_n \neq 0.
$$

In the following section we shall show how this formula can be used in
the singular cases where some of the Hurwitz determinants $\Delta_i$ are zero.


2. We now consider the special case where all the roots of f(z) are in the
left half-plane Re z < 0. By Routh’s criterion, all the $a_0, b_0, c_0, d_0, \ldots$ must
then be different from zero and of like sign. Since we are concerned here
with the regular case, we obtain from (34) for k = 0 the following criterion:


CRITERION OF ROUTH-HURWITZ: All the roots of the real polynomial
$f(z) = a_0z^n + \cdots$ $(a_0 \neq 0)$ have negative real parts if and only if the inequalities

$$
a_0\Delta_1 > 0, \quad \Delta_2 > 0, \quad a_0\Delta_3 > 0, \quad \Delta_4 > 0, \ \ldots, \quad
\begin{cases}
a_0\Delta_n > 0 & \text{(for odd } n\text{)}, \\
\Delta_n > 0 & \text{(for even } n\text{)}
\end{cases}
\tag{35}
$$

hold.


Note. If $a_0 > 0$, these conditions can be written as follows:

$$
\Delta_1 > 0, \quad \Delta_2 > 0, \quad \ldots, \quad \Delta_n > 0. \tag{36}
$$

If we use the usual notation for the coefficients of the polynomial

$$
f(z) = a_0z^{n} + a_1z^{n-1} + a_2z^{n-2} + \cdots + a_{n-1}z + a_n,
$$

then for $a_0 > 0$ the Routh-Hurwitz conditions (36) can be written in the
form of the following determinantal inequalities:

$$
a_1 > 0, \quad
\begin{vmatrix} a_1 & a_3 \\ a_0 & a_2 \end{vmatrix} > 0, \quad
\begin{vmatrix} a_1 & a_3 & a_5 \\ a_0 & a_2 & a_4 \\ 0 & a_1 & a_3 \end{vmatrix} > 0, \ \ldots, \quad
\begin{vmatrix}
a_1 & a_3 & a_5 & \cdots & 0 \\
a_0 & a_2 & a_4 & \cdots & 0 \\
0 & a_1 & a_3 & \cdots & 0 \\
0 & a_0 & a_2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a_n
\end{vmatrix} > 0. \tag{36’}
$$

A real polynomial $f(z) = a_0z^n + \cdots$ whose coefficients satisfy (35), i.e.,
whose roots have negative real parts, is often called a Hurwitz polynomial.
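
The formula (34’) and the conditions (36) lend themselves to a short numerical check. The following sketch is an illustration under stated assumptions: exact rational arithmetic, the regular case for the root count, $a_0 > 0$ for the test (36), and hypothetical function names.

```python
from fractions import Fraction

def hurwitz_determinants(coeffs):
    """Successive principal minors of the Hurwitz matrix (27)."""
    n = len(coeffs) - 1
    a = [Fraction(x) for x in coeffs[0::2]]
    b = [Fraction(x) for x in coeffs[1::2]]
    get = lambda seq, k: seq[k] if 0 <= k < len(seq) else Fraction(0)
    H = [[get(b if i % 2 == 0 else a, j - i // 2) for j in range(n)]
         for i in range(n)]

    def det(M):
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
                   for j in range(len(M)))

    return [det([row[:p] for row in H[:p]]) for p in range(1, n + 1)]

def sign_variations(seq):
    signs = [x > 0 for x in seq if x != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))

def roots_in_right_half_plane(coeffs):
    """Formula (34'); regular case only."""
    d = hurwitz_determinants(coeffs)
    if any(x == 0 for x in d):
        raise ValueError("singular case: some Hurwitz determinant vanishes")
    odd = [Fraction(coeffs[0])] + d[0::2]     # a_0, Delta_1, Delta_3, ...
    even = [Fraction(1)] + d[1::2]            # 1, Delta_2, Delta_4, ...
    return sign_variations(odd) + sign_variations(even)

def is_hurwitz_polynomial(coeffs):
    """Conditions (36), assuming a_0 > 0."""
    return coeffs[0] > 0 and all(x > 0 for x in hurwitz_determinants(coeffs))

print(roots_in_right_half_plane([1, 2, 3, 4, 5]))
print(is_hurwitz_polynomial([1, 6, 11, 6]))
```

The Laplace expansion used for the determinants is adequate only for small n; for larger orders an elimination-based determinant would be the natural substitute.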


3. In conclusion, we mention a remarkable property of Routh’s scheme. 


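Let $f_0, f_1, \ldots$ and $g_0, g_1, \ldots$ be the (m + 1)-th and (m + 2)-th rows of the
scheme ($f_0 = \Delta_m/\Delta_{m-1}$, $g_0 = \Delta_{m+1}/\Delta_m$). Since these two rows together
with the subsequent rows form a Routh scheme of their own, the elements
of the (m + p + 1)-th row (of the original scheme) can be expressed in
terms of the elements of the (m + 1)-th and (m + 2)-th rows $f_0, f_1, \ldots$ and
$g_0, g_1, \ldots$ by the same formulas as the (p + 1)-th row can in terms of the
elements of the first two rows $a_0, a_1, \ldots$ and $b_0, b_1, \ldots$; that is, if we set

$$
\tilde{H} = \begin{Vmatrix}
g_0 & g_1 & g_2 & \cdots \\
f_0 & f_1 & f_2 & \cdots \\
0 & g_0 & g_1 & \cdots \\
0 & f_0 & f_1 & \cdots \\
\vdots & \vdots & \vdots &
\end{Vmatrix},
$$

then we have

$$
\frac{H\begin{pmatrix} 1 & \cdots & m+p-1 & m+p \\ 1 & \cdots & m+p-1 & m+p+k-1 \end{pmatrix}}{H\begin{pmatrix} 1 & \cdots & m+p-1 \\ 1 & \cdots & m+p-1 \end{pmatrix}}
= \frac{\tilde{H}\begin{pmatrix} 1 & \cdots & p-1 & p \\ 1 & \cdots & p-1 & p+k-1 \end{pmatrix}}{\tilde{H}\begin{pmatrix} 1 & \cdots & p-1 \\ 1 & \cdots & p-1 \end{pmatrix}}. \tag{37}
$$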


The Hurwitz determinant $\Delta_{m+p}$ is equal to the product of the first m + p
numbers in the sequence $b_0, c_0, \ldots$:

$$
\Delta_{m+p} = b_0c_0 \cdots f_0g_0 \cdots l_0.
$$

But

$$
\Delta_m = b_0c_0 \cdots f_0, \qquad \tilde{\Delta}_p = g_0 \cdots l_0.
$$

Therefore the following important relation²⁹ holds:

$$
\Delta_{m+p} = \Delta_m\tilde{\Delta}_p. \tag{38}
$$

29 Here $\tilde{\Delta}_p$ is the minor of order p in the top left-hand corner of $\tilde{H}$.

The formula (38) holds whenever the numbers $f_0, f_1, \ldots$ and $g_0, g_1, \ldots$
are well defined, i.e., under the conditions $\Delta_{m-1} \neq 0$, $\Delta_m \neq 0$.

The formula (37) has a meaning if in addition to the conditions $\Delta_{m-1} \neq 0$,
$\Delta_m \neq 0$ we also have $\Delta_{m+p-1} \neq 0$. From this condition it follows that the
denominator of the fraction on the right-hand side of (37) is also different
from zero: $\tilde{\Delta}_{p-1} \neq 0$.

§ 7. Orlando’s Formula 


1. In the discussion of the cases where some of the Hurwitz determinants
are zero we shall have to use the following formula of Orlando [294], which
expresses the determinant $\Delta_{n-1}$ in terms of the highest coefficient $a_0$ and the
roots $z_1, z_2, \ldots, z_n$ of f(z):³⁰

$$
\Delta_{n-1} = (-1)^{\frac{n(n-1)}{2}}\, a_0^{\,n-1} \prod_{i<k}^{1,\ldots,n} (z_i + z_k). \tag{39}
$$

For n = 2 this reduces to the well-known formula for the coefficient $b_0$
in the quadratic equation $a_0z^2 + b_0z + a_1 = 0$:

$$
\Delta_1 = b_0 = -a_0(z_1 + z_2).
$$

Let us assume that the formula (39) is true for polynomials of degree n,
$f(z) = a_0z^{n} + b_0z^{n-1} + \cdots$, and show that it is then true for polynomials of
degree n + 1

$$
F(z) = (z + h)f(z) = a_0z^{n+1} + (b_0 + ha_0)z^{n} + (a_1 + hb_0)z^{n-1} + \cdots \qquad (h = -z_{n+1}).
$$

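For this purpose we form the auxiliary determinant of order n + 1

$$
D = \begin{vmatrix}
b_0 & b_1 & \cdots & b_{n-1} & h^{n} \\
a_0 & a_1 & \cdots & a_{n-1} & -h^{n-1} \\
0 & b_0 & \cdots & b_{n-2} & h^{n-2} \\
0 & a_0 & \cdots & a_{n-2} & -h^{n-3} \\
\vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & \cdots & (-1)^{n}
\end{vmatrix}
\qquad \left( a_k = 0 \text{ for } k > \left[\tfrac{n}{2}\right], \quad b_k = 0 \text{ for } k > \left[\tfrac{n-1}{2}\right] \right).
$$

30 The coefficients of f(z) may be arbitrary complex numbers.

We multiply the first row of D by $a_0$ and add to it the second row multiplied
by $-b_0$, the third multiplied by $a_1$, the fourth by $-b_1$, etc. Then
in the first row all the elements except the last are zero, and the last element
is f(h). Hence we deduce that

$$
D = (-1)^{n}\,\Delta_{n-1}\,f(h).
$$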
On the other hand, when we add to each row of D (except the last) the
next multiplied by h we obtain, apart from a factor $(-1)^n$, the Hurwitz
determinant $\Delta_n^*$ of order n for the polynomial F(z):

$$
D = \begin{vmatrix}
b_0 + ha_0 & b_1 + ha_1 & \cdots & 0 \\
a_0 & a_1 + hb_0 & \cdots & 0 \\
0 & b_0 + ha_0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & (-1)^{n}
\end{vmatrix} = (-1)^{n}\,\Delta_n^*.
$$

Thus

$$
\Delta_n^* = \Delta_{n-1}\,f(h) = a_0\,\Delta_{n-1} \prod_{i=1}^{n} (h - z_i).
$$

When we replace $\Delta_{n-1}$ by its expression (39) and set $h = -z_{n+1}$, we obtain

$$
\Delta_n^* = (-1)^{\frac{(n+1)n}{2}}\, a_0^{\,n} \prod_{i<k}^{1,\ldots,n+1} (z_i + z_k).
$$

Thus, by mathematical induction Orlando’s formula is established for
polynomials of every degree.
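
Orlando’s formula (39) can be checked on polynomials with prescribed roots. The sketch below is illustrative only: it uses integer roots and exact rational arithmetic, and the helper names `poly_from_roots`, `hurwitz_minor` and `orlando` are assumptions introduced here.

```python
from fractions import Fraction
from itertools import combinations

def poly_from_roots(a0, roots):
    """Coefficients (highest first) of a0 * (z - z_1) ... (z - z_n)."""
    coeffs = [Fraction(a0)]
    for r in roots:
        # multiply by (z - r): z*p(z) minus r*p(z)
        coeffs = [x - r * y for x, y in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def hurwitz_minor(coeffs, p):
    """Hurwitz determinant Delta_p of the polynomial with the given coefficients."""
    a, b = coeffs[0::2], coeffs[1::2]
    get = lambda seq, k: seq[k] if 0 <= k < len(seq) else Fraction(0)
    H = [[get(b if i % 2 == 0 else a, j - i // 2) for j in range(p)]
         for i in range(p)]

    def det(M):
        if len(M) == 1:
            return M[0][0]
        return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
                   for j in range(len(M)))

    return det(H)

def orlando(a0, roots):
    """Right-hand side of (39)."""
    n = len(roots)
    prod = Fraction(1)
    for zi, zk in combinations(roots, 2):
        prod *= zi + zk
    return (-1) ** (n * (n - 1) // 2) * Fraction(a0) ** (n - 1) * prod

for a0, roots in [(1, [1, 2, 3]), (2, [1, 2, 3]), (1, [1, -1, 2])]:
    f = poly_from_roots(a0, roots)
    print(hurwitz_minor(f, len(roots) - 1), orlando(a0, roots))
```

In the last case two of the roots are opposite, and both sides vanish.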

From Orlando’s formula it follows that: $\Delta_{n-1} = 0$ if and only if the sum
of two roots of f(z) is zero.³¹

Since $\Delta_n = c\,\Delta_{n-1}$, where c is the constant term of the polynomial f(z)
($c = (-1)^n a_0z_1z_2 \cdots z_n$), it follows from (39) that:

$$
\Delta_n = (-1)^{\frac{n(n+1)}{2}}\, a_0^{\,n}\, z_1z_2 \cdots z_n \prod_{i<k}^{1,\ldots,n} (z_i + z_k). \tag{40}
$$

The last formula shows that: $\Delta_n$ vanishes if and only if f(z) has a pair
of opposite roots z and −z.


31 In particular, $\Delta_{n-1} = 0$ when f(z) has at least one pair of conjugate pure imaginary
roots or multiple zero roots.

§ 8. Singular Cases in the Routh-Hurwitz Theorem


In discussing the singular cases where some of the Hurwitz determinants
are zero, we may assume that $\Delta_n \neq 0$ (and consequently $\Delta_{n-1} \neq 0$).

For if $\Delta_n = 0$, then, as we have seen at the end of the preceding section,
the real polynomial f(z) has a root z’ for which −z’ is also a root. If we set
$f(z) = F_1(z) + F_2(z)$, where

$$
F_1(z) = a_0z^{n} + a_1z^{n-2} + \cdots, \qquad F_2(z) = b_0z^{n-1} + b_1z^{n-3} + \cdots,
$$

then we can deduce from $f(z') = f(-z') = 0$ that $F_1(z') = F_2(z') = 0$.
Therefore z’ is a root of the greatest common divisor $d(z)$ of the polynomials
$F_1(z)$ and $F_2(z)$. Setting $f(z) = d(z)f^*(z)$, we reduce the Routh-Hurwitz
problem for f(z) to that for the polynomial $f^*(z)$ for which the last Hurwitz
determinant is different from zero.


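1. To begin with, we examine the case where

$$
\Delta_1 = \cdots = \Delta_p = 0, \qquad \Delta_{p+1} \neq 0, \ \ldots, \ \Delta_n \neq 0. \tag{41}
$$

From $\Delta_1 = 0$ it follows that $b_0 = 0$; from
$\Delta_2 = \begin{vmatrix} 0 & b_1 \\ a_0 & a_1 \end{vmatrix} = -a_0b_1 = 0$ it
follows that $b_1 = 0$. But then we have automatically

$$
\Delta_3 = \begin{vmatrix} 0 & b_1 & b_2 \\ a_0 & a_1 & a_2 \\ 0 & 0 & b_1 \end{vmatrix} = -a_0b_1^2 = 0.
$$

From

$$
\Delta_4 = \begin{vmatrix} 0 & 0 & b_2 & b_3 \\ a_0 & a_1 & a_2 & a_3 \\ 0 & 0 & 0 & b_2 \\ 0 & a_0 & a_1 & a_2 \end{vmatrix} = -a_0^2b_2^2 = 0
$$

it follows that $b_2 = 0$ and then $\Delta_5 = -a_0^2b_2^3 = 0$, etc.

This argument shows that in (41) p is always an odd number p = 2h − 1.
Then $b_0 = b_1 = b_2 = \cdots = b_{h-1} = 0$, $b_h \neq 0$, and³²

$$
\Delta_{p+1} = \Delta_{2h} = (-1)^{\frac{h(h+1)}{2}} a_0^h b_h^h, \qquad
\Delta_{p+2} = \Delta_{2h+1} = (-1)^{\frac{h(h+1)}{2}} a_0^h b_h^{h+1} = \Delta_{p+1}b_h. \tag{42}
$$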


Let us vary the coefficients $b_0, b_1, \ldots, b_{h-1}$ in such a way that for the
new, slightly altered values $b_0^{*}, b_1^{*}, \ldots, b_{h-1}^{*}$ all the Hurwitz determinants
$\Delta_1^{*}, \Delta_2^{*}, \ldots, \Delta_n^{*}$ become different from zero and $\Delta_{p+1}^{*}, \ldots, \Delta_n^{*}$ keep their
previous signs. We shall take $b_0^{*}, b_1^{*}, \ldots, b_{h-1}^{*}$ as ‘small’ values of different
orders of ‘smallness’; indeed, we shall assume that every $b_{j-1}^{*}$ is in absolute
value ‘considerably’ smaller than $b_j^{*}$ ($j = 1, 2, \ldots, h$; $b_h^{*} = b_h$). The
latter means that in computing the sign of an integral algebraic expression
in the $b_i^{*}$ we can neglect terms in which some $b_i^{*}$ have an index less than
$j$ in comparison with terms where all the $b_i^{*}$ have an index at least $j$.

32 From (42) it follows that for odd $h$, $\operatorname{sign} \Delta_{p+2} = (-1)^{\frac{h+1}{2}} \operatorname{sign} a_0$, and for even
$h$, $\operatorname{sign} \Delta_{p+1} = (-1)^{\frac{h}{2}}$.
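We can then easily find the ‘sign-determining’ terms of $\Delta_1^{*}, \Delta_2^{*}, \ldots, \Delta_p^{*}$
($p = 2h - 1$):³³

$$\Delta_1^{*} = b_0^{*}, \quad \Delta_2^{*} = -a_0 b_1^{*} + \cdots, \quad \Delta_3^{*} = -a_0 (b_1^{*})^{2} + \cdots, \quad \Delta_4^{*} = -a_0^{2} (b_2^{*})^{2} + \cdots,$$
$$\Delta_5^{*} = -a_0^{2} (b_2^{*})^{3} + \cdots, \quad \Delta_6^{*} = a_0^{3} (b_3^{*})^{3} + \cdots,$$

etc.; in general,

$$\begin{aligned}
\Delta_{2j}^{*} &= (-1)^{\frac{j(j+1)}{2}} a_0^{j} (b_j^{*})^{j} + \cdots & (j &= 1, 2, \ldots, h-1),\\
\Delta_{2j+1}^{*} &= (-1)^{\frac{j(j+1)}{2}} a_0^{j} (b_j^{*})^{j+1} + \cdots & (j &= 0, 1, \ldots, h-1).
\end{aligned} \tag{43}$$

We choose $b_0^{*}, b_1^{*}, \ldots, b_{h-1}^{*}$ as positive; then the sign of $\Delta_i^{*}$ is determined
by the formula

$$\operatorname{sign} \Delta_i^{*} = (-1)^{\frac{j(j+1)}{2}} \operatorname{sign} a_0^{j} \qquad \left(j = \left[\tfrac{i}{2}\right];\ i = 1, 2, \ldots, p\right). \tag{44}$$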


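In any small variation of the coefficients of the polynomial the number
$k$ remains unchanged, because $f(z)$ has no roots on the imaginary axis.
Therefore, starting from (44) we determine the number of roots in the
right half-plane by the formula

$$k = V\left(a_0, \Delta_1^{*}, \frac{\Delta_2^{*}}{\Delta_1^{*}}, \ldots, \frac{\Delta_{p+1}^{*}}{\Delta_p^{*}}, \frac{\Delta_{p+2}^{*}}{\Delta_{p+1}^{*}}\right) + V\left(\frac{\Delta_{p+2}}{\Delta_{p+1}}, \ldots, \frac{\Delta_n}{\Delta_{n-1}}\right). \tag{45}$$

An elementary calculation based on (42) and (44) shows that

$$V\left(a_0, \Delta_1^{*}, \frac{\Delta_2^{*}}{\Delta_1^{*}}, \ldots, \frac{\Delta_{p+1}^{*}}{\Delta_p^{*}}, \frac{\Delta_{p+2}^{*}}{\Delta_{p+1}^{*}}\right) = h + \frac{1 - (-1)^{h}\varepsilon}{2} \qquad \left(p = 2h - 1,\ \varepsilon = \operatorname{sign}\left(a_0 \frac{\Delta_{p+2}}{\Delta_{p+1}}\right)\right). \tag{46}$$

Note that the value on the left-hand side of (46) does not depend on the
method of varying the coefficients and remains one and the same for
arbitrary small variations. This follows from (45), because $k$ does not
change its value under small variations of the coefficients.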


33 Essentially the same terms have already been computed above for $\Delta_1, \Delta_2, \ldots, \Delta_5$.


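2. Suppose now that for $s > 0$

$$\Delta_{s+1} = \cdots = \Delta_{s+p} = 0 \tag{47}$$

and that all the remaining Hurwitz determinants are different from zero.

We denote by $\tilde a_0, \tilde a_1, \ldots$ and $\tilde b_0, \tilde b_1, \ldots$ the elements of the $(s+1)$-th
and $(s+2)$-th rows in Routh’s scheme ($\tilde a_0 = \Delta_s/\Delta_{s-1}$, $\tilde b_0 = \Delta_{s+1}/\Delta_s$). We denote
the corresponding determinants by $\tilde\Delta_1, \tilde\Delta_2, \ldots, \tilde\Delta_{n-s}$. By formula (38) (p. 195),

$$\Delta_{s+1} = \Delta_s \tilde\Delta_1, \ \ldots, \ \Delta_{s+p} = \Delta_s \tilde\Delta_p, \quad \Delta_{s+p+1} = \Delta_s \tilde\Delta_{p+1}, \quad \Delta_{s+p+2} = \Delta_s \tilde\Delta_{p+2}. \tag{48}$$

Then by 1. it follows that $p$ is odd, say $p = 2h - 1$.³⁴

Let us vary the coefficients of $f(z)$ in such a way that all the Hurwitz
determinants become different from zero and that those that were different
from zero before the variation retain their sign. Since the formula (46) is
applicable to the determinants $\tilde\Delta$, we then obtain, starting from (48):

$$V\left(\frac{\Delta_s}{\Delta_{s-1}}, \frac{\Delta_{s+1}}{\Delta_s}, \ldots, \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right) = h + \frac{1 - (-1)^{h}\varepsilon}{2} \qquad \left(p = 2h - 1,\ \varepsilon = \operatorname{sign}\left(\frac{\Delta_s}{\Delta_{s-1}} \cdot \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right)\right), \tag{49}$$

$$k = V\left(a_0, \Delta_1, \ldots, \frac{\Delta_s}{\Delta_{s-1}}\right) + V\left(\frac{\Delta_s}{\Delta_{s-1}}, \ldots, \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right) + V\left(\frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}, \ldots, \frac{\Delta_n}{\Delta_{n-1}}\right).$$

The value on the left-hand side of (49) again does not depend on the method
of variation.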


3. Finally, let us assume that among the Hurwitz determinants there are
$\nu$ groups of zero determinants. We shall show that for every such group
(47) the value on the left-hand side of (49) does not depend on the method
of variation and is determined by that formula.³⁵ We have proved this
statement for $\nu = 1$. Let us assume that it is true for $\nu - 1$ groups and
then show that it is also true for $\nu$ groups. Suppose that (47) is the second
of the $\nu$ groups; we determine $\tilde\Delta_1, \tilde\Delta_2, \ldots$ in the same way as was done under 2.;
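then for this variation

$$V\left(\frac{\Delta_s}{\Delta_{s-1}}, \frac{\Delta_{s+1}}{\Delta_s}, \ldots, \frac{\Delta_n}{\Delta_{n-1}}\right) = V\left(\tilde a_0, \tilde\Delta_1, \frac{\tilde\Delta_2}{\tilde\Delta_1}, \ldots, \frac{\tilde\Delta_{n-s}}{\tilde\Delta_{n-s-1}}\right).$$

34 In accordance with footnote 32, for $p = 2h - 1$ and odd $h$,
$$\operatorname{sign} \Delta_{s+p+2} = (-1)^{\frac{h+1}{2}} \operatorname{sign} \Delta_{s-1};$$
and for even $h$,
$$\operatorname{sign} \Delta_{s+p+1} = (-1)^{\frac{h}{2}} \operatorname{sign} \Delta_s.$$

35 From (47) and $\Delta_s \neq 0$, $\Delta_{s+p+1} \neq 0$ it follows by (48) and (42) that $\Delta_{s-1} \neq 0$,
$\Delta_{s+p+2} \neq 0$.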


Since we have only $\nu - 1$ groups of zero determinants on the right-hand side
of this equation, our statement holds for the right-hand side and hence for
the left-hand side of the equation. In other words, the formula (49) holds
for the second, ..., $\nu$-th group of zero Hurwitz determinants. But then it
follows from the formula

$$k = V\left(a_0, \Delta_1, \frac{\Delta_2}{\Delta_1}, \ldots, \frac{\Delta_n}{\Delta_{n-1}}\right)$$

that the value of $V\left(\dfrac{\Delta_s}{\Delta_{s-1}}, \ldots, \dfrac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right)$ does not depend on the
method of variation for the first group of zero determinants, and therefore
that (49) holds for this group as well.
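Thus we have proved the following theorem:

THEOREM 5: If some of the Hurwitz determinants are zero, but $\Delta_n \neq 0$,
then the number of roots of the real polynomial $f(z)$ in the right half-plane
is determined by the formula

$$k = V\left(a_0, \Delta_1, \frac{\Delta_2}{\Delta_1}, \ldots, \frac{\Delta_n}{\Delta_{n-1}}\right),$$

in which for the calculation of the value of $V$ for every group of $p$ successive
zero determinants ($p$ is always odd!)

$$(\Delta_s \neq 0) \quad \Delta_{s+1} = \cdots = \Delta_{s+p} = 0 \quad (\Delta_{s+p+1} \neq 0)$$

we have to set

$$V\left(\frac{\Delta_s}{\Delta_{s-1}}, \frac{\Delta_{s+1}}{\Delta_s}, \ldots, \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right) = h + \frac{1 - (-1)^{h}\varepsilon}{2}, \tag{50}$$

where³⁶

$$p = 2h - 1 \quad \text{and} \quad \varepsilon = \operatorname{sign}\left(\frac{\Delta_s}{\Delta_{s-1}} \cdot \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right).$$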
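The counting rule of Theorem 5 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not part of the original text: the function names are hypothetical, the coefficients are passed in the order $a_0, a_1, \ldots, a_n$ of $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ (not in the $a_i$, $b_i$ notation of this section), and exact rational arithmetic is used so that vanishing determinants are detected reliably. The convention $\Delta_0 = 1$ is an assumption of the sketch; it reproduces the replacement of $\Delta_s/\Delta_{s-1}$ by $\Delta_1$ for $s = 1$, while the case $s = 0$ is handled by returning $a_0$.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result


def hurwitz_minors(coeffs):
    """Leading principal minors Delta_1, ..., Delta_n of the Hurwitz matrix."""
    n = len(coeffs) - 1
    entry = lambda i: coeffs[i] if 0 <= i <= n else 0
    h = [[entry(2 * (j + 1) - (i + 1)) for j in range(n)] for i in range(n)]
    return [det([row[:k] for row in h[:k]]) for k in range(1, n + 1)]


def sign(x):
    return (x > 0) - (x < 0)


def right_half_plane_roots(coeffs):
    """Number k of roots with positive real part, by the rule of Theorem 5."""
    n = len(coeffs) - 1
    delta = [Fraction(1)] + hurwitz_minors(coeffs)  # delta[0] = 1 by convention
    if delta[n] == 0:
        raise ValueError("Theorem 5 assumes Delta_n != 0")

    def term(i):
        # i-th member of the sequence a_0, Delta_1, Delta_2/Delta_1, ...
        return Fraction(coeffs[0]) if i == 0 else delta[i] / delta[i - 1]

    k, i = 0, 0
    while i < n:
        if delta[i + 1] != 0:
            # Regular step: one ordinary sign comparison.
            k += sign(term(i)) != sign(term(i + 1))
            i += 1
        else:
            # Group Delta_{s+1} = ... = Delta_{s+p} = 0, handled by (50).
            s, p = i, 0
            while delta[s + 1 + p] == 0:
                p += 1
            if p % 2 == 0:
                raise ValueError("unexpected even group length")
            h = (p + 1) // 2
            eps = sign(term(s) * term(s + p + 2))
            k += h + (1 - (-1) ** h * eps) // 2
            i = s + p + 2
    return k


print(right_half_plane_roots([1, 0, 1, 1]))  # f(z) = z^3 + z + 1
```

For $f(z) = z^3 + z + 1$ we have $\Delta_1 = 0$, $\Delta_2 = \Delta_3 = -1$, so $s = 0$, $p = 1$, $h = 1$, $\varepsilon = 1$, and the rule gives $k = 2$. This agrees with a direct check: the polynomial has one negative real root and the sum of its roots is zero, so the complex pair lies in the right half-plane.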


36 For $s = 1$, $\Delta_s/\Delta_{s-1}$ is to be replaced by $\Delta_1$; and for $s = 0$, by $a_0$.


§ 9. The Method of Quadratic Forms. Determination of the
Number of Distinct Real Roots of a Polynomial


Routh obtained his algorithm by applying Sturm’s theorem to the computation
of the Cauchy index of a regular rational fraction of special type (see
formula (10) on p. 178). Of the two polynomials in this fraction (numerator
and denominator), one contains only even, the other only odd powers of
the argument $z$.

In this and in the following sections we shall explain the deeper and
more comprehensive method of quadratic forms, due to Hermite, in its
application to the Routh-Hurwitz problem. By means of this method we
shall obtain an expression for the index of an arbitrary rational fraction
in terms of the coefficients of the numerator and denominator. The method
of quadratic forms enables us to apply the results of Frobenius’ subtle
investigations in the theory of Hankel forms (Vol. I, Chapter X, § 10) to the
Routh-Hurwitz problem and to establish a close connection of certain
remarkable theorems of Chebyshev and Markov with the problem of stability.


1. We shall acquaint the reader with the method of quadratic forms first
in the comparatively simple problem of determining the number of distinct
real roots of a polynomial.

In the solution of this problem we may restrict ourselves to the case
where $f(z)$ is a real polynomial. For suppose that $f(z) = u(z) + iv(z)$ is
a complex polynomial ($u(z)$ and $v(z)$ being real polynomials). Each real
root of $f(z)$ makes $u(z)$ and $v(z)$ vanish simultaneously. Therefore the
complex polynomial $f(z)$ has the same real roots as the real polynomial $d(z)$,
the greatest common divisor of $u(z)$ and $v(z)$.

Thus, let $f(z)$ be a real polynomial with the distinct roots $\alpha_1, \alpha_2, \ldots, \alpha_q$
of the respective multiplicities $n_1, n_2, \ldots, n_q$:

$$f(z) = a_0 (z - \alpha_1)^{n_1} (z - \alpha_2)^{n_2} \cdots (z - \alpha_q)^{n_q}$$
$$(a_0 \neq 0;\ \alpha_i \neq \alpha_k \text{ for } i \neq k;\ i, k = 1, 2, \ldots, q).$$

We introduce Newton’s sums

$$s_p = \sum_{j=1}^{q} n_j \alpha_j^{p} \qquad (p = 0, 1, 2, \ldots).$$

With these sums we form the Hankel forms

$$S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k,$$

where $n$ is an arbitrary integer, $n \geq q$.

Then the following theorem holds:

THEOREM 6: The number of all the distinct roots of $f(z)$ is equal to the
rank, and the number of all the distinct real roots to the signature, of the
form $S_n(x, x)$.

Proof. From the definition of the form $S_n(x, x)$ we immediately obtain
the following representation:

$$S_n(x, x) = \sum_{j=1}^{q} n_j \left(x_0 + \alpha_j x_1 + \alpha_j^{2} x_2 + \cdots + \alpha_j^{n-1} x_{n-1}\right)^{2}. \tag{51}$$

Here to each root $\alpha_j$ of $f(z)$ there corresponds the square of a linear form
$Z_j = x_0 + \alpha_j x_1 + \cdots + \alpha_j^{n-1} x_{n-1}$ ($j = 1, 2, \ldots, q$). The forms $Z_1, Z_2, \ldots, Z_q$
are linearly independent, since their coefficients form the Vandermonde
matrix $\|\alpha_j^{k}\|$ whose rank is equal to the number of distinct $\alpha_j$, i.e., to $q$.
Therefore (see Vol. I, p. 297) the rank of the form $S_n(x, x)$ is $q$.

In the representation (51) to each real root $\alpha_j$ there corresponds a positive
square. To each pair of conjugate complex roots $\alpha_j$ and $\bar\alpha_j$ there
correspond two complex conjugate forms:

$$Z_j = P_j + iQ_j, \qquad \bar Z_j = P_j - iQ_j;$$

the corresponding terms in (51) together give one positive and one negative
square:

$$n_j Z_j^{2} + n_j \bar Z_j^{2} = 2 n_j P_j^{2} - 2 n_j Q_j^{2}.$$

Hence it is easy to see³⁷ that the signature of $S_n(x, x)$, i.e., the difference
between the number of positive and negative squares, is equal to the number
of distinct real $\alpha_j$.

This proves the theorem.

2. Using the rule for determining the signature of a quadratic form that
we established in Chapter X (Vol. I, p. 303), we obtain from the theorem the
following corollary:

COROLLARY: The number of distinct real roots of the real polynomial
$f(z)$ is equal to the excess of permanences of sign over variations of sign in
the sequence

$$1, \quad s_0, \quad \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix}, \quad \ldots, \quad \begin{vmatrix} s_0 & s_1 & \cdots & s_{n-1} \\ s_1 & s_2 & \cdots & s_n \\ \cdots & \cdots & \cdots & \cdots \\ s_{n-1} & s_n & \cdots & s_{2n-2} \end{vmatrix}, \tag{52}$$

where the $s_p$ ($p = 0, 1, \ldots$) are Newton’s sums for $f(z)$ and $n$ is any integer
not less than the number $q$ of distinct roots of $f(z)$ (in particular, $n$ can be
chosen as the degree of $f(z)$).
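The corollary can be tried out with a short Python sketch. It is an illustration under stated assumptions, not part of the original text: the function names are hypothetical, $n$ is taken to be the degree of $f(z)$, Newton's sums are obtained from the coefficients by Newton's identities (a standard fact not derived in this section), and the sketch covers only the case in which no member of the sequence (52) vanishes.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result


def newton_sums(coeffs, count):
    """Newton's sums s_0, ..., s_{count-1} of the roots, via Newton's identities."""
    a = [Fraction(c) for c in coeffs]
    n = len(a) - 1
    s = [Fraction(n)]                      # s_0 = sum of multiplicities = degree
    for k in range(1, count):
        total = sum(a[i] * s[k - i] for i in range(1, min(k, n + 1)))
        if k <= n:
            total += k * a[k]
        s.append(-total / a[0])
    return s


def count_distinct_real_roots(coeffs):
    """Permanences minus variations in the sequence (52), with n = deg f."""
    n = len(coeffs) - 1
    s = newton_sums(coeffs, 2 * n - 1)
    seq = [Fraction(1)]
    for p in range(1, n + 1):
        seq.append(det([[s[i + k] for k in range(p)] for i in range(p)]))
    if any(d == 0 for d in seq):
        raise ValueError("a member of (52) vanishes; the plain rule does not apply")
    variations = sum((x > 0) != (y > 0) for x, y in zip(seq, seq[1:]))
    return n - 2 * variations


print(count_distinct_real_roots([1, 0, 1, 1]))   # f(z) = z^3 + z + 1
```

For $f(z) = z^3 + z + 1$ the sums are $s_0, \ldots, s_4 = 3, 0, -2, -3, 2$ and the sequence (52) is $1, 3, -6, -31$, with two permanences and one variation, so the polynomial has exactly one real root.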


37 The quadratic form $S_n(x, x)$ is representable as an (algebraic) sum of $q$ squares of
the real forms $Z_j$ (for real $\alpha_j$) and $P_j$ and $Q_j$ (for complex $\alpha_j$). These forms are linearly
independent, since the rank of $S_n(x, x)$ is $q$.

This rule for determining the number of distinct real roots is directly
applicable only when all the numbers in (52) are different from zero. However,
since we deal here with the computation of the signature of a Hankel
form, by the results of Vol. I, Chapter X, § 10, the rule with proper
refinements remains valid in the general case (for further details see § 11 of that
chapter).

From our theorem it follows that: All the forms

$$S_n(x, x) \qquad (n = q, q+1, \ldots)$$

have the same rank and the same signature.

In applying Theorem 6 (or its corollary) to determine the number of
distinct real roots, we may take $n$ to be the degree of $f(z)$.

The number of distinct real roots of the real polynomial $f(z)$ is equal to
the index $I_{-\infty}^{+\infty} \dfrac{f'(z)}{f(z)}$ (see p. 175). Therefore the corollary to Theorem 6 gives
the formula

$$I_{-\infty}^{+\infty} \frac{f'(z)}{f(z)} = n - 2V\left(1,\ s_0,\ \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix},\ \ldots,\ \begin{vmatrix} s_0 & s_1 & \cdots & s_{n-1} \\ s_1 & s_2 & \cdots & s_n \\ \cdots & \cdots & \cdots & \cdots \\ s_{n-1} & s_n & \cdots & s_{2n-2} \end{vmatrix}\right),$$

where $s_p = \sum_{j=1}^{q} n_j \alpha_j^{p}$ ($p = 0, 1, \ldots$) are Newton’s sums and $n$ is the degree
of $f(z)$.

In § 11 we shall establish a similar formula for the index of an arbitrary
rational fraction. The information on infinite Hankel matrices that will be
required for this purpose will be given in the next section.


§ 10. Infinite Hankel Matrices of Finite Rank 


1. Let

$$s_0, s_1, s_2, \ldots$$

be a sequence of complex numbers. This determines an infinite symmetric
matrix

$$S = \begin{Vmatrix} s_0 & s_1 & s_2 & \cdots \\ s_1 & s_2 & s_3 & \cdots \\ s_2 & s_3 & s_4 & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{Vmatrix},$$

which is usually called a Hankel matrix. Together with the infinite Hankel
matrices we shall consider³⁸ the finite Hankel matrices $S_n = \|s_{i+k}\|_0^{n-1}$
and their associated Hankel forms

$$S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k.$$

The successive principal minors of $S$ will be denoted by $D_1, D_2, D_3, \ldots$:

$$D_p = |s_{i+k}|_0^{p-1} \qquad (p = 1, 2, \ldots).$$


38 See Vol. I, Chapter X, § 10.

Infinite matrices may be of finite or of infinite rank. In the latter case,
the matrices have non-zero minors of arbitrarily large order. The following
theorem gives a necessary and sufficient condition for a sequence of numbers
$s_0, s_1, s_2, \ldots$ to generate an infinite Hankel matrix $S = \|s_{i+k}\|_0^{\infty}$ of finite
rank.

THEOREM 7: The infinite matrix $S = \|s_{i+k}\|_0^{\infty}$ is of finite rank $r$ if and
only if there exist $r$ numbers $\alpha_1, \alpha_2, \ldots, \alpha_r$ such that

$$s_q = \sum_{g=1}^{r} \alpha_g s_{q-g} \qquad (q = r, r+1, \ldots) \tag{53}$$

and $r$ is the least number having this property.

Proof. If the matrix $S = \|s_{i+k}\|_0^{\infty}$ has finite rank $r$, then its first
$r + 1$ rows $R_1, R_2, \ldots, R_{r+1}$ are linearly dependent. Therefore there exists
a number $h \leq r$ such that $R_1, R_2, \ldots, R_h$ are linearly independent and $R_{h+1}$
is a linear combination of them:

$$R_{h+1} = \sum_{g=1}^{h} \alpha_g R_{h-g+1}.$$

We consider the rows $R_{q+1}, R_{q+2}, \ldots, R_{q+h+1}$, where $q$ is any non-negative
integer. From the structure of $S$ it is immediately clear that the
rows $R_{q+1}, R_{q+2}, \ldots, R_{q+h+1}$ are obtained from $R_1, R_2, \ldots, R_{h+1}$ by a
‘shortening’ process in which the elements in the first $q$ columns are omitted.
Therefore

$$R_{q+h+1} = \sum_{g=1}^{h} \alpha_g R_{q+h-g+1} \qquad (q = 0, 1, 2, \ldots).$$

Thus, every row of $S$ beginning with the $(h+1)$-th can be expressed linearly
in terms of the $h$ preceding rows and therefore in terms of the linearly
independent first $h$ rows. Hence it follows that the rank of $S$ is $r = h$.³⁹
The linear dependence

$$R_{q+h+1} = \sum_{g=1}^{h} \alpha_g R_{q+h-g+1}$$

after replacement of $h$ by $r$ and written in more convenient notation
yields (53).

Conversely, if (53) holds, then every row (column) of $S$ is a linear
combination of the first $r$ rows (columns). Therefore all the minors of $S$ whose
orders exceed $r$ are zero and $S$ is of rank at most $r$. But the rank cannot be
less than $r$, since then, as we have already shown, there would be relations
of the form (53) with a smaller value than $r$, and this contradicts the second
condition of the theorem. The proof of the theorem is now complete.


COROLLARY: If the infinite Hankel matrix $S = \|s_{i+k}\|_0^{\infty}$ is of finite
rank $r$, then

$$D_r = |s_{i+k}|_0^{r-1} \neq 0.$$

For it follows from the relations (53) that every row (column) of $S$ is
a linear combination of the first $r$ rows (columns). Therefore every minor
of $S$ of order $r$ can be represented in the form $\alpha D_r$, where $\alpha$ is a constant.
Hence it follows that $D_r \neq 0$.

Note. For finite Hankel matrices of rank $r$ the inequality $D_r \neq 0$ need
not hold. For example, $S_2 = \begin{Vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{Vmatrix}$ for $s_0 = s_1 = 0$, $s_2 \neq 0$ is of rank 1,
whereas $D_1 = s_0 = 0$.


39 The statement ‘The number of linearly independent rows in a rectangular matrix is
equal to its rank’ is true not only for finite rows but also for infinite rows.

2. We shall now explain certain remarkable connections between infinite
Hankel matrices and rational functions.

Let

$$R(z) = \frac{g(z)}{h(z)}$$

be a proper rational fractional function, where

$$h(z) = a_0 z^{m} + \cdots + a_m \quad (a_0 \neq 0), \qquad g(z) = b_1 z^{m-1} + b_2 z^{m-2} + \cdots + b_m.$$

We write the expansion of $R(z)$ in a power series of negative powers of $z$:

$$R(z) = \frac{g(z)}{h(z)} = \frac{s_0}{z} + \frac{s_1}{z^{2}} + \frac{s_2}{z^{3}} + \cdots.$$

If all the poles of $R(z)$, i.e., all the values of $z$ for which $R(z)$ becomes infinite,
lie in the circle $|z| \leq a$, then the series on the right-hand side of the
expansion converges for $|z| > a$. We multiply both sides by the denominator
$h(z)$:

$$\left(a_0 z^{m} + a_1 z^{m-1} + \cdots + a_m\right)\left(\frac{s_0}{z} + \frac{s_1}{z^{2}} + \frac{s_2}{z^{3}} + \cdots\right) = b_1 z^{m-1} + b_2 z^{m-2} + \cdots + b_m.$$

Equating coefficients of equal powers of $z$ on both sides of this identity,
we obtain the following system of relations:

$$\begin{aligned}
a_0 s_0 &= b_1,\\
a_0 s_1 + a_1 s_0 &= b_2,\\
&\cdots\\
a_0 s_{m-1} + a_1 s_{m-2} + \cdots + a_{m-1} s_0 &= b_m,
\end{aligned} \tag{54}$$

$$a_0 s_q + a_1 s_{q-1} + \cdots + a_m s_{q-m} = 0 \qquad (q = m, m+1, \ldots). \tag{54'}$$

Setting

$$\alpha_g = -\frac{a_g}{a_0} \qquad (g = 1, 2, \ldots, m),$$

we can write the relations (54') in the form (53) (for $r = m$). Therefore,
by Theorem 7, the infinite Hankel matrix

$$S = \|s_{i+k}\|_0^{\infty}$$

formed from the coefficients $s_0, s_1, s_2, \ldots$ is of finite rank ($\leq m$).

Conversely, if the matrix $S = \|s_{i+k}\|_0^{\infty}$ is of finite rank $r$, then the
relations (53) hold, which can be written in the form (54') (for $m = r$). Then,
when we define the numbers $b_1, b_2, \ldots, b_m$ by the equations (54) we have the
expansion

$$\frac{b_1 z^{m-1} + \cdots + b_m}{a_0 z^{m} + a_1 z^{m-1} + \cdots + a_m} = \frac{s_0}{z} + \frac{s_1}{z^{2}} + \cdots.$$

The least degree of the denominator $m$ for which this expansion holds is
the same as the least integer $m$ for which the relations (53) hold. By
Theorem 7, this least value of $m$ is the rank of $S = \|s_{i+k}\|_0^{\infty}$.

Thus we have proved the following theorem:

THEOREM 8: The matrix $S = \|s_{i+k}\|_0^{\infty}$ is of finite rank if and only if
the sum of the series

$$R(z) = \frac{s_0}{z} + \frac{s_1}{z^{2}} + \frac{s_2}{z^{3}} + \cdots$$

is a rational function of $z$. In this case the rank of $S$ is the same as the
number of poles of $R(z)$, counting each pole with its proper multiplicity.
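Theorem 8 can be illustrated with a small Python sketch. It is not part of the original text: the function names are hypothetical, the coefficients $s_q$ are generated from the relations (54) and (54'), and the rank of the infinite matrix is estimated from a finite section whose order exceeds the degree of the denominator (an assumption that is justified by the corollary to Theorem 7).

```python
from fractions import Fraction


def laurent_coefficients(g, h, count):
    """s_0, ..., s_{count-1} in g(z)/h(z) = s_0/z + s_1/z^2 + ..., deg g < deg h.

    h = [a_0, ..., a_m] and g = [b_1, ..., b_m] (shorter g is padded with zeros).
    """
    a = [Fraction(c) for c in h]
    m = len(a) - 1
    b = [Fraction(0)] * (m - len(g)) + [Fraction(c) for c in g]
    s = []
    for q in range(count):
        rhs = b[q] if q < m else Fraction(0)            # relations (54), (54')
        acc = sum(a[i] * s[q - i] for i in range(1, min(q, m) + 1))
        s.append((rhs - acc) / a[0])
    return s


def hankel_section(s, n):
    """Finite Hankel matrix S_n = ||s_{i+k}||, i, k = 0, ..., n-1."""
    return [[s[i + k] for k in range(n)] for i in range(n)]


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# R(z) = z / ((z - 1)(z - 2)) has two simple poles.
s = laurent_coefficients([1, 0], [1, -3, 2], 9)
print(rank(hankel_section(s, 5)))

# R(z) = (z - 1) / ((z - 1)(z - 2)) reduces to 1/(z - 2): a single pole.
s = laurent_coefficients([1, -1], [1, -3, 2], 9)
print(rank(hankel_section(s, 5)))
```

In the second example the fraction is reducible, and the rank of the section drops to the degree of the reduced denominator, as the theorem predicts.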
§ 11. Determination of the Index of an Arbitrary Rational Fraction
by the Coefficients of Numerator and Denominator


1. Suppose given a rational function. We write its expansion in a series
of descending powers of $z$:⁴⁰

$$R(z) = s_{-m-1} z^{m} + \cdots + s_{-2} z + s_{-1} + \frac{s_0}{z} + \frac{s_1}{z^{2}} + \cdots. \tag{55}$$

The sequence of coefficients of the negative powers of $z$

$$s_0, s_1, s_2, \ldots$$

determines an infinite Hankel matrix $S = \|s_{i+k}\|_0^{\infty}$.
We have thus established a correspondence

$$R(z) \sim S.$$

Obviously two rational functions whose difference is an integral function
correspond to one and the same matrix $S$. However, not every matrix
$S = \|s_{i+k}\|_0^{\infty}$ corresponds to some rational function. In the preceding
section we have seen that an infinite matrix $S$ corresponds to a rational
function if and only if it is of finite rank. This rank is equal to the number
of poles of $R(z)$ (multiplicities taken into account), i.e., to the degree of
the denominator $f(z)$ in the reduced fraction $g(z)/f(z) = R(z)$. By means
of the expansion (55) we have a one-to-one correspondence between proper
rational functions $R(z)$ and Hankel matrices $S = \|s_{i+k}\|_0^{\infty}$ of finite rank.

We mention some properties of the correspondence:

1. If $R_1(z) \sim S_1$, $R_2(z) \sim S_2$, then for arbitrary numbers $c_1$, $c_2$

$$c_1 R_1(z) + c_2 R_2(z) \sim c_1 S_1 + c_2 S_2.$$

In what follows we shall have to deal with the case where the coefficients
of the numerator and the denominator of $R(z)$ are integral rational functions
of a parameter $\alpha$; $R$ is then a rational function of $z$ and $\alpha$. From the
expansion (54) it follows that in this case the numbers $s_0, s_1, s_2, \ldots$, i.e., the
elements of $S$, depend rationally on $\alpha$. Differentiating (55) term by term
with respect to $\alpha$, we obtain:

2. If $R(z, \alpha) \sim S(\alpha)$, then $\dfrac{\partial R}{\partial \alpha} \sim \dfrac{\partial S}{\partial \alpha}$.⁴¹

40 The series (55) converges outside every circle (with center at $z = 0$) containing all
the poles of $R(z)$.

41 If $S = \|s_{i+k}\|_0^{\infty}$, then $\dfrac{\partial S}{\partial \alpha} = \left\|\dfrac{\partial s_{i+k}}{\partial \alpha}\right\|_0^{\infty}$.
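2. Let us write down the expansion of $R(z)$ in partial fractions:

$$R(z) = Q(z) + \sum_{j=1}^{q} \left[\frac{A_1^{(j)}}{z - \alpha_j} + \frac{A_2^{(j)}}{(z - \alpha_j)^{2}} + \cdots + \frac{A_{\nu_j}^{(j)}}{(z - \alpha_j)^{\nu_j}}\right], \tag{56}$$

where $Q(z)$ is a polynomial; we shall show how to construct the matrix $S$
corresponding to $R(z)$ from the numbers $\alpha$ and $A$.

For this purpose we consider first the simple rational function

$$\frac{1}{z - \alpha}.$$

It corresponds to the matrix

$$S_\alpha = \|\alpha^{i+k}\|_0^{\infty}.$$

The form $S_{\alpha n}(x, x)$ associated with this matrix is

$$S_{\alpha n}(x, x) = \sum_{i,k=0}^{n-1} \alpha^{i+k} x_i x_k = \left(x_0 + \alpha x_1 + \cdots + \alpha^{n-1} x_{n-1}\right)^{2}.$$

If

$$R(z) = Q(z) + \sum_{j=1}^{q} \frac{A^{(j)}}{z - \alpha_j},$$

then by 1. the corresponding matrix $S$ is determined by the formula

$$S = \sum_{j=1}^{q} A^{(j)} S_{\alpha_j} = \Big\|\sum_{j=1}^{q} A^{(j)} \alpha_j^{i+k}\Big\|_0^{\infty}$$

and the corresponding quadratic form is

$$S_n(x, x) = \sum_{j=1}^{q} A^{(j)} \left(x_0 + \alpha_j x_1 + \cdots + \alpha_j^{n-1} x_{n-1}\right)^{2}.$$

In order to proceed to the general case (56), we first differentiate the
relation

$$\frac{1}{z - \alpha} \sim S_\alpha = \|\alpha^{i+k}\|_0^{\infty}$$

$h - 1$ times term by term. By 1. and 2., we obtain:

$$\frac{1}{(z - \alpha)^{h}} \sim \frac{1}{(h-1)!} \frac{\partial^{h-1} S_\alpha}{\partial \alpha^{h-1}} = \Big\|\binom{i+k}{h-1} \alpha^{i+k-h+1}\Big\|_0^{\infty} \qquad \left(\binom{i+k}{h-1} = 0 \text{ for } i + k < h - 1\right).$$

Therefore, by using rule 1. again we find in the general case, where $R(z)$
has the expansion (56):

$$R(z) \sim S = \sum_{j=1}^{q} \left(A_1^{(j)} + A_2^{(j)} \frac{1}{1!} \frac{\partial}{\partial \alpha_j} + \cdots + A_{\nu_j}^{(j)} \frac{1}{(\nu_j - 1)!} \frac{\partial^{\nu_j - 1}}{\partial \alpha_j^{\nu_j - 1}}\right) S_{\alpha_j}. \tag{57}$$

By carrying out the differentiation, we obtain:

$$S = \Big\|\sum_{j=1}^{q} \left[A_1^{(j)} \alpha_j^{i+k} + A_2^{(j)} \binom{i+k}{1} \alpha_j^{i+k-1} + \cdots + A_{\nu_j}^{(j)} \binom{i+k}{\nu_j - 1} \alpha_j^{i+k-\nu_j+1}\right]\Big\|_0^{\infty}.$$

The corresponding Hankel form $S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k$ is

$$S_n(x, x) = \sum_{j=1}^{q} \left(A_1^{(j)} + A_2^{(j)} \frac{1}{1!} \frac{\partial}{\partial \alpha_j} + \cdots + A_{\nu_j}^{(j)} \frac{1}{(\nu_j - 1)!} \frac{\partial^{\nu_j - 1}}{\partial \alpha_j^{\nu_j - 1}}\right) \left(x_0 + \alpha_j x_1 + \cdots + \alpha_j^{n-1} x_{n-1}\right)^{2}.$$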

3. Now we are in a position to enunciate and prove the fundamental
theorem:⁴²

THEOREM 9: If

$$R(z) \sim S$$

and $m$ is the rank of $S$,⁴³ then the Cauchy index $I_{-\infty}^{+\infty} R(z)$ is equal to the
signature⁴⁴ of the form $S_n(x, x)$ for any $n \geq m$:

$$I_{-\infty}^{+\infty} R(z) = \sigma[S_n(x, x)].$$

Proof. Suppose that the expansion (56) holds. Then, by (57),

$$S = \sum_{j=1}^{q} T_{\alpha_j},$$

where each term is of the form

$$T_\alpha = \left(A_1 + A_2 \frac{1}{1!} \frac{\partial}{\partial \alpha} + \cdots + A_\nu \frac{1}{(\nu - 1)!} \frac{\partial^{\nu-1}}{\partial \alpha^{\nu-1}}\right) S_\alpha, \qquad S_\alpha = \|\alpha^{i+k}\|_0^{\infty}, \tag{58}$$

and

$$S_n(x, x) = \sum_{j=1}^{q} T_{\alpha_j}(x, x) = \sum_{\alpha_j \text{ real}} T_{\alpha_j}(x, x) + \sum_{\alpha_j \text{ complex}} \left[T_{\alpha_j}(x, x) + T_{\bar\alpha_j}(x, x)\right].$$

42 This theorem was proved by Hermite in 1856 for the simplest case where $R(z)$ has
no multiple poles [187]. In the general case it was proved by Hurwitz [204] (see also
[25], pp. 17-19). The proof in the text differs from Hurwitz’ proof.

43 As we have already mentioned, $m$ is the degree of the denominator in the reduced
representation of the rational fraction $R(z)$ (see Theorem 8 on p. 207).

44 We denote the signature of $S_n(x, x)$ by $\sigma[S_n(x, x)]$.
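By Theorem 8, the rank of the matrix $T_{\alpha_j}$, and hence of the form $T_{\alpha_j}(x, x)$,
is $\nu_j$ ($j = 1, 2, \ldots, q$) and the rank of $S_n(x, x)$ is $m = \sum_{j=1}^{q} \nu_j$. But if the
rank of the sum of certain real quadratic forms is equal to the sum of the
ranks of the constituent forms, then the same relation holds for the
signatures:

$$\sigma[S_n(x, x)] = \sum_{\alpha_j \text{ real}} \sigma[T_{\alpha_j}(x, x)] + \sum_{\alpha_j \text{ complex}} \sigma[T_{\alpha_j}(x, x) + T_{\bar\alpha_j}(x, x)]. \tag{59}$$

We consider two cases separately:

1) $\alpha$ is real. Under any variation of the parameters $A_1, A_2, \ldots, A_{\nu-1}$
and $\alpha$ in

$$\frac{A_1}{z - \alpha} + \frac{A_2}{(z - \alpha)^{2}} + \cdots + \frac{A_\nu}{(z - \alpha)^{\nu}} \tag{60}$$

the rank of the corresponding matrix $T_\alpha$ remains unchanged ($= \nu$); therefore
the signature of $T_\alpha(x, x)$ also remains unchanged (see Vol. I, p. 309).
Therefore $\sigma[T_\alpha(x, x)]$ does not change if we set in (58) and (60): $A_1 = \cdots =
A_{\nu-1} = 0$ and $\alpha = 0$, i.e., if for $T_\alpha$ we take the matrix

$$\left[\frac{A_\nu}{(\nu - 1)!} \frac{\partial^{\nu-1} S_\alpha}{\partial \alpha^{\nu-1}}\right]_{\alpha = 0} = \begin{Vmatrix} 0 & \cdots & 0 & A_\nu & 0 & \cdots \\ 0 & \cdots & A_\nu & 0 & 0 & \cdots \\ \vdots & & \vdots & \vdots & \vdots & \\ A_\nu & \cdots & 0 & 0 & 0 & \cdots \\ 0 & \cdots & 0 & 0 & 0 & \cdots \\ \vdots & & \vdots & \vdots & \vdots & \end{Vmatrix},$$

in which $A_\nu$ stands in the places with $i + k = \nu - 1$ and all the other elements are zero.

The corresponding quadratic form is equal to

$$\begin{aligned}
&2A_\nu \left(x_0 x_{\nu-1} + x_1 x_{\nu-2} + \cdots + x_{s-1} x_s\right) && \text{for } \nu = 2s,\\
&A_\nu \left[2\left(x_0 x_{\nu-1} + \cdots + x_{s-2} x_s\right) + x_{s-1}^{2}\right] && \text{for } \nu = 2s - 1
\end{aligned} \qquad (s = 1, 2, 3, \ldots).$$

But the signature of the upper form is always zero and that of the lower
form is $\operatorname{sign} A_\nu$.⁴⁵ Thus, if $\alpha$ is real, then

$$\sigma[T_\alpha(x, x)] = \begin{cases} 0 & \text{for even } \nu,\\ \operatorname{sign} A_\nu & \text{for odd } \nu. \end{cases} \tag{61}$$

2) $\alpha$ is complex.

$$T_\alpha(x, x) = \sum_{k=1}^{\nu} (P_k + iQ_k)^{2}, \qquad T_{\bar\alpha}(x, x) = \sum_{k=1}^{\nu} (P_k - iQ_k)^{2},$$

where $P_k$, $Q_k$ ($k = 1, 2, \ldots, \nu$) are real linear forms in the variables $x_0, x_1,
x_2, \ldots, x_{n-1}$. Then

$$T_\alpha(x, x) + T_{\bar\alpha}(x, x) = 2\sum_{k=1}^{\nu} P_k^{2} - 2\sum_{k=1}^{\nu} Q_k^{2}. \tag{62}$$

Since the rank of this quadratic form is $2\nu$, the $P_k$, $Q_k$ ($k = 1, 2, \ldots, \nu$) are
linearly independent, so that by (62) for a complex $\alpha$

$$\sigma[T_\alpha(x, x) + T_{\bar\alpha}(x, x)] = 0. \tag{63}$$

From (59), (61), and (63) it follows that

$$\sigma[S_n(x, x)] = \sum_{\substack{\alpha_j \text{ real}\\ \nu_j \text{ odd}}} \operatorname{sign} A_{\nu_j}^{(j)}.$$

But on p. 175 we saw that the sum on the right-hand side of this equation
is $I_{-\infty}^{+\infty} R(z)$. This completes the proof.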
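45 Each of the products $x_0 x_{\nu-1}, x_1 x_{\nu-2}, \ldots$ can be replaced by a difference of squares

$$\left(\frac{x_0 + x_{\nu-1}}{2}\right)^{2} - \left(\frac{x_0 - x_{\nu-1}}{2}\right)^{2}, \quad \left(\frac{x_1 + x_{\nu-2}}{2}\right)^{2} - \left(\frac{x_1 - x_{\nu-2}}{2}\right)^{2}, \quad \ldots$$

All the squares so obtained are linearly independent.

From this theorem we deduce:

COROLLARY 1: If $R(z) \sim S = \|s_{i+k}\|_0^{\infty}$ and $m$ is the rank of $S$, then
all the quadratic forms $S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k$ ($n = m, m+1, \ldots$) have
one and the same signature.

In Chapter X, § 10 (Vol. I, pp. 343-44) we established a rule for computing
the signature of a Hankel form; moreover, Frobenius’ investigations
enabled us to formulate a rule that embraces all singular cases. By the
theorem above we can apply this rule to compute the Cauchy index. Thus
we obtain:

COROLLARY 2: The index of an arbitrary rational function $R(z)$ whose
corresponding matrix $S = \|s_{i+k}\|_0^{\infty}$ is of rank $m$, is determined by the
formula

$$I_{-\infty}^{+\infty} R(z) = m - 2V(1, D_1, D_2, \ldots, D_m), \tag{64}$$

where

$$D_f = |s_{i+k}|_0^{f-1} = \begin{vmatrix} s_0 & s_1 & \cdots & s_{f-1} \\ s_1 & s_2 & \cdots & s_f \\ \cdots & \cdots & \cdots & \cdots \\ s_{f-1} & s_f & \cdots & s_{2f-2} \end{vmatrix} \qquad (f = 1, 2, \ldots, m); \tag{65}$$

if among $D_1, D_2, \ldots, D_m$ there is a group of vanishing determinants⁴⁶

$$(D_h \neq 0) \quad D_{h+1} = \cdots = D_{h+p} = 0 \quad (D_{h+p+1} \neq 0),$$

then in the computation of $V(D_h, D_{h+1}, \ldots, D_{h+p+1})$ we can take

$$\operatorname{sign} D_{h+j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} D_h \qquad (j = 1, 2, \ldots, p)$$

and this gives

$$V(D_h, D_{h+1}, \ldots, D_{h+p+1}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p,\\[2ex] \dfrac{p+1-\varepsilon}{2} & \text{for even } p \text{ and } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign} \dfrac{D_{h+p+1}}{D_h}. \end{cases} \tag{66}$$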

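46 Here we always have $D_m \neq 0$ (p. 206).

In order to express the index of a rational function in terms of the
coefficients of the numerator and denominator we shall require some
additional relations.

First of all, we can always represent $R(z)$ in the form⁴⁷

$$R(z) = Q(z) + \frac{g(z)}{h(z)},$$

where $Q(z)$, $g(z)$, $h(z)$ are polynomials and

$$h(z) = a_0 z^{m} + a_1 z^{m-1} + \cdots + a_m \quad (a_0 \neq 0), \qquad g(z) = b_0 z^{m} + b_1 z^{m-1} + \cdots + b_m.$$

Obviously,

$$I_{-\infty}^{+\infty} R(z) = I_{-\infty}^{+\infty} \frac{g(z)}{h(z)}.$$

Let

$$\frac{g(z)}{h(z)} = s_{-1} + \frac{s_0}{z} + \frac{s_1}{z^{2}} + \cdots.$$

If we now get rid of the denominator and then equate equal powers of $z$ on
the two sides of the equation, we obtain:

$$\begin{aligned}
a_0 s_{-1} &= b_0,\\
a_0 s_0 + a_1 s_{-1} &= b_1,\\
&\cdots\\
a_0 s_{m-1} + a_1 s_{m-2} + \cdots + a_m s_{-1} &= b_m,\\
a_0 s_q + a_1 s_{q-1} + \cdots + a_m s_{q-m} &= 0 \qquad (q = m, m+1, \ldots).
\end{aligned} \tag{67}$$

47 It is not necessary to replace $R(z)$ by a proper fraction. For what follows it is
sufficient that the degree of $g(z)$ does not exceed that of $h(z)$.

Using (67), we find an expression for the following determinant of order
$2p$ in which we put $a_j = 0$, $b_j = 0$ for $j > m$:

$$\begin{vmatrix} a_0 & a_1 & a_2 & \cdots & a_{2p-1} \\ b_0 & b_1 & b_2 & \cdots & b_{2p-1} \\ 0 & a_0 & a_1 & \cdots & a_{2p-2} \\ 0 & b_0 & b_1 & \cdots & b_{2p-2} \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & \cdots & 0 \\ s_{-1} & s_0 & s_1 & \cdots & s_{2p-2} \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & s_{-1} & s_0 & \cdots & s_{2p-3} \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{vmatrix} \cdot \begin{vmatrix} a_0 & a_1 & a_2 & \cdots & a_{2p-1} \\ 0 & a_0 & a_1 & \cdots & a_{2p-2} \\ 0 & 0 & a_0 & \cdots & a_{2p-3} \\ \cdots & \cdots & \cdots & \cdots & \cdots \end{vmatrix}$$

$$= (-1)^{\frac{p(p-1)}{2}} a_0^{2p} \begin{vmatrix} s_{p-1} & s_p & \cdots & s_{2p-2} \\ s_{p-2} & s_{p-1} & \cdots & s_{2p-3} \\ \cdots & \cdots & \cdots & \cdots \\ s_0 & s_1 & \cdots & s_{p-1} \end{vmatrix} = a_0^{2p} \begin{vmatrix} s_0 & s_1 & \cdots & s_{p-1} \\ s_1 & s_2 & \cdots & s_p \\ \cdots & \cdots & \cdots & \cdots \\ s_{p-1} & s_p & \cdots & s_{2p-2} \end{vmatrix}. \tag{68}$$

We introduce the abbreviation

$$\nabla_{2p} = \begin{vmatrix} a_0 & a_1 & \cdots & a_{2p-1} \\ b_0 & b_1 & \cdots & b_{2p-1} \\ 0 & a_0 & \cdots & a_{2p-2} \\ 0 & b_0 & \cdots & b_{2p-2} \\ \cdots & \cdots & \cdots & \cdots \end{vmatrix} \qquad (p = 1, 2, \ldots;\ a_j = b_j = 0 \text{ for } j > m). \tag{69}$$

Then (68) can be written as follows:

$$\nabla_{2p} = a_0^{2p} D_p \qquad (p = 1, 2, \ldots). \tag{68'}$$

By this formula, Corollary 2 above leads to the following theorem: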
THEOREM 10: If $\nabla_{2m} \neq 0$,⁴⁸ then

$$I_{-\infty}^{+\infty} \frac{b_0 z^{m} + b_1 z^{m-1} + \cdots + b_m}{a_0 z^{m} + a_1 z^{m-1} + \cdots + a_m} = m - 2V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2m}) \qquad (a_0 \neq 0), \tag{70}$$

where $\nabla_{2p}$ ($p = 1, 2, \ldots, m$) is determined by (69); if there is a group of
zero determinants

$$(\nabla_{2h} \neq 0) \quad \nabla_{2h+2} = \cdots = \nabla_{2h+2p} = 0 \quad (\nabla_{2h+2p+2} \neq 0),$$

then in computing $V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p+2})$ we have to set:

$$\operatorname{sign} \nabla_{2h+2j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} \nabla_{2h} \qquad (j = 1, 2, \ldots, p)$$

or, what is the same,

$$V(\nabla_{2h}, \ldots, \nabla_{2h+2p+2}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p,\\[2ex] \dfrac{p+1-\varepsilon}{2} & \text{for even } p \text{ and } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign} \dfrac{\nabla_{2h+2p+2}}{\nabla_{2h}}. \end{cases}$$

Note. If $\nabla_{2m} = 0$, i.e., if the fraction under the index sign in (70) is
reducible, then (70) must be replaced by another formula

$$I_{-\infty}^{+\infty} \frac{b_0 z^{m} + b_1 z^{m-1} + \cdots + b_m}{a_0 z^{m} + a_1 z^{m-1} + \cdots + a_m} = r - 2V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2r}), \tag{70'}$$

where $r$ is the number of poles (including multiplicities) of the rational
fraction under the index sign (i.e., $r$ is the degree of the denominator in the
reduced fraction).

For in this case the index we are interested in is

$$r - 2V(1, D_1, D_2, \ldots, D_r),$$

since $r$ is the rank of the corresponding matrix $S = \|s_{i+k}\|_0^{\infty}$. But the
equation (68') is of a formal character and also holds for reducible fractions.
Therefore

$$V(1, D_1, D_2, \ldots, D_r) = V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2r}),$$

and we have reached (70').

Formula (70') enables us to express the index of every rational fraction
in which the degree of the numerator does not exceed that of the denominator
in terms of the coefficients of numerator and denominator.
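Formula (70) lends itself to direct computation. The Python sketch below is an illustration under stated assumptions, not part of the original text: the function names are hypothetical, the determinants are built row by row as in (69), and only the regular case is treated, in which none of the determinants $\nabla_2, \nabla_4, \ldots, \nabla_{2m}$ vanishes.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result


def cauchy_index(g, h):
    """Index of g(z)/h(z) over the real axis by (70), with deg g <= deg h.

    h = [a_0, ..., a_m], g = highest-degree-first coefficients (zero padded).
    """
    m = len(h) - 1
    if len(g) > m + 1:
        raise ValueError("the degree of g must not exceed that of h")
    b = [0] * (m + 1 - len(g)) + list(g)
    coef = lambda c, i: c[i] if 0 <= i <= m else 0   # a_j = b_j = 0 for j > m
    seq = [Fraction(1)]
    for p in range(1, m + 1):
        rows = []
        for shift in range(p):
            # Alternate shifted rows of a's and b's, as in (69).
            rows.append([coef(h, j - shift) for j in range(2 * p)])
            rows.append([coef(b, j - shift) for j in range(2 * p)])
        seq.append(det(rows))
    if any(v == 0 for v in seq):
        raise ValueError("a determinant (69) vanishes; the refined rule is needed")
    variations = sum((x > 0) != (y > 0) for x, y in zip(seq, seq[1:]))
    return m - 2 * variations


# 1/(z - 1) + 1/(z - 2) = (2z - 3)/(z^2 - 3z + 2): two jumps from -inf to +inf.
print(cauchy_index([2, -3], [1, -3, 2]))
# z/(z^2 + 1) has no real poles.
print(cauchy_index([1, 0], [1, 0, 1]))
```

In the first example $\nabla_2 = 2$ and $\nabla_4 = 1$, so there are no variations and the index equals $m = 2$; in the second $\nabla_2 = 1$, $\nabla_4 = -1$, and the index is $2 - 2 = 0$.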


48 The condition $\nabla_{2m} \neq 0$ means that $D_m \neq 0$, so that the fraction under the index
sign in (70) is reduced.
§ 12. Another Proof of the Routh-Hurwitz Theorem


1. In § 6 we proved the Routh-Hurwitz theorem with the help of Sturm’s
theorem and the Routh algorithm. In this section we shall give an alternative
proof based on Theorem 10 of § 11 and on properties of the Cauchy
indices.

We mention a few properties of the Cauchy indices that will be required
in what follows.

1. $I_a^b R(x) = -I_b^a R(x)$.⁴⁹

2. $I_a^b R_1(x) R(x) = \operatorname{sign} R_1(x) \cdot I_a^b R(x)$ if $R_1(x) \neq 0, \infty$ within the interval
$(a, b)$.

3. If $a < c < b$, then $I_a^b R(x) = I_a^c R(x) + I_c^b R(x) + \eta_c$, where $\eta_c = 0$ if
$R(c)$ is finite and $\eta_c = \pm 1$ if $R(x)$ becomes infinite at $c$; here $\eta_c = +1$
corresponds to a jump from $-\infty$ to $+\infty$ at $c$ (for increasing $x$), and $\eta_c = -1$
to a jump from $+\infty$ to $-\infty$.

4. If $R(-x) = -R(x)$, then $I_{-a}^{0} R(x) = I_0^{a} R(x)$. If $R(-x) = R(x)$,
then $I_{-a}^{0} R(x) = -I_0^{a} R(x)$.

5. $I_a^b R(x) + I_a^b \dfrac{1}{R(x)} = \dfrac{\varepsilon_b - \varepsilon_a}{2}$, where $\varepsilon_a$ is the sign of $R(x)$ within
$(a, b)$ near $a$ and $\varepsilon_b$ is the sign of $R(x)$ within $(a, b)$ near $b$.

The first four properties follow immediately from the definition of the
Cauchy index (see § 2). Property 5. follows from the fact that the sum of
the indices $I_a^b R(x)$ and $I_a^b \dfrac{1}{R(x)}$ is equal to the difference $n_1 - n_2$, where $n_1$
is the number of times $R(x)$ changes from negative to positive when $x$
changes from $a$ to $b$, and $n_2$ the number of times $R(x)$ changes from positive
to negative.

We consider a real polynomial⁵⁰

$$f(z) = a_0 z^{n} + a_1 z^{n-1} + a_2 z^{n-2} + \cdots + a_{n-1} z + a_n \qquad (a_0 > 0).$$

We can represent it in the form

$$f(z) = h(z^{2}) + z g(z^{2}),$$

where

$$h(u) = a_n + a_{n-2} u + \cdots, \qquad g(u) = a_{n-1} + a_{n-3} u + \cdots.$$

49 Here and in what follows the lower limit of the index may be $-\infty$ and the upper
limit may be $+\infty$.

50 We have here reverted to the usual notation for the coefficients of a polynomial.
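We shall use the notation

$$\rho = I_{-\infty}^{+\infty} \frac{a_1 z^{n-1} - a_3 z^{n-3} + \cdots}{a_0 z^{n} - a_2 z^{n-2} + \cdots}. \tag{71}$$

In § 3 we proved (see (20) on p. 180) that

$$\rho = n - 2k - s, \tag{72}$$

where $k$ is the number of roots of $f(z)$ with positive real parts and $s$ the
number of roots of $f(z)$ on the imaginary axis.

We shall transform the expression (71) for $\rho$.

To begin with, we deal with the case where $n$ is even. Let $n = 2m$. Then

$$h(u) = a_0 u^{m} + a_2 u^{m-1} + \cdots + a_n, \qquad g(u) = a_1 u^{m-1} + a_3 u^{m-2} + \cdots + a_{n-1}.$$

Using the properties 1.-4. and setting $\eta = \pm 1$ if $\lim_{u \to -0} \dfrac{g(u)}{h(u)} = \pm\infty$, respectively,
and $\eta = 0$ otherwise, we have:

$$\begin{aligned}
\rho &= -I_{-\infty}^{+\infty} \frac{z g(-z^{2})}{h(-z^{2})} = -\left(I_{-\infty}^{0} + I_{0}^{+\infty}\right) \frac{z g(-z^{2})}{h(-z^{2})} - \eta = -2 I_{0}^{+\infty} \frac{z g(-z^{2})}{h(-z^{2})} - \eta\\
&= -2 I_{0}^{+\infty} \frac{g(-z^{2})}{h(-z^{2})} - \eta = 2 I_{-\infty}^{0} \frac{g(u)}{h(u)} - \eta = I_{-\infty}^{0} \frac{g(u)}{h(u)} - I_{-\infty}^{0} \frac{u g(u)}{h(u)} - \eta\\
&= I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)}.
\end{aligned}$$

Similarly we have for odd $n$, $n = 2m + 1$:

$$h(u) = a_1 u^{m} + a_3 u^{m-1} + \cdots + a_n, \qquad g(u) = a_0 u^{m} + a_2 u^{m-1} + \cdots + a_{n-1}.$$

Setting⁵¹ $\zeta = \operatorname{sign}\left[\dfrac{g(u)}{h(u)}\right]_{u = -0}$ if $\lim_{u \to -0} \dfrac{u g(u)}{h(u)} = 0$ and $\zeta = 0$ otherwise, we find:

$$\begin{aligned}
\rho &= I_{-\infty}^{+\infty} \frac{h(-z^{2})}{z g(-z^{2})} = \left(I_{-\infty}^{0} + I_{0}^{+\infty}\right) \frac{h(-z^{2})}{z g(-z^{2})} + \zeta = 2 I_{0}^{+\infty} \frac{h(-z^{2})}{z g(-z^{2})} + \zeta\\
&= -2 I_{-\infty}^{0} \frac{h(u)}{g(u)} + \zeta = I_{-\infty}^{0} \frac{h(u)}{u g(u)} - I_{-\infty}^{0} \frac{h(u)}{g(u)} + \zeta = I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} - I_{-\infty}^{+\infty} \frac{h(u)}{g(u)}.
\end{aligned}$$

Thus⁵²

$$\rho = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} \qquad (n = 2m), \tag{73'}$$

$$\rho = I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} - I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} \qquad (n = 2m + 1). \tag{73''}$$

51 Here we mean by $\operatorname{sign}\,[g(u)/h(u)]_{u = -0}$ the sign of $g(u)/h(u)$ for negative values
of $u$ of sufficiently small modulus.

52 If $a_1 \neq 0$, then the two formulas (73') and (73'') may be combined into the single
formula

$$\rho = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} + I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)}. \tag{73'''}$$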

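As before, we denote by $\Delta_1, \Delta_2, \ldots, \Delta_n$ the Hurwitz determinants of $f(z)$.
We assume that $\Delta_n \neq 0$.⁵³

1) $n = 2m$. By (70),⁵⁴

$$I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = m - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}), \tag{74}$$

$$I_{-\infty}^{+\infty} \frac{u g(u)}{h(u)} = m - 2V(1, -\Delta_2, +\Delta_4, -\Delta_6, \ldots) = -m + 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n). \tag{75}$$

But then, by (73'),

$$\rho = n - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) - 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n),$$

which in conjunction with $\rho = n - 2k$ gives

$$k = V(1, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots). \tag{76}$$

2) $n = 2m + 1$. By (70),⁵⁵

$$I_{-\infty}^{+\infty} \frac{h(u)}{u g(u)} = m + 1 - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n), \tag{77}$$

$$I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} = m - 2V(1, -\Delta_2, +\Delta_4, \ldots) = -m + 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}). \tag{78}$$

The equation $\rho = 2m + 1 - 2k$ together with (73''), (77), and (78) again
gives (76).

This proves the Routh-Hurwitz theorem (see p. 194).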
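Formula (76) can be evaluated directly from the coefficients. The Python sketch below is an illustration, not part of the original text: the function names are hypothetical, the Hurwitz matrix is assumed to have the entry $a_{2j-i}$ in row $i$, column $j$, and only the regular case with all $\Delta_i \neq 0$ is covered. A negative leading coefficient is handled by changing the sign of all coefficients, which leaves the roots unchanged and restores the assumption $a_0 > 0$.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result


def hurwitz_minors(coeffs):
    """Leading principal minors Delta_1, ..., Delta_n of the Hurwitz matrix."""
    n = len(coeffs) - 1
    entry = lambda i: coeffs[i] if 0 <= i <= n else 0
    h = [[entry(2 * (j + 1) - (i + 1)) for j in range(n)] for i in range(n)]
    return [det([row[:k] for row in h[:k]]) for k in range(1, n + 1)]


def variations(seq):
    """Number of sign variations in a sequence without zeros."""
    return sum((x > 0) != (y > 0) for x, y in zip(seq, seq[1:]))


def right_half_plane_roots(coeffs):
    """k from (76); requires every Hurwitz determinant to be nonzero."""
    if coeffs[0] < 0:
        coeffs = [-c for c in coeffs]          # normalize to a_0 > 0
    delta = hurwitz_minors(coeffs)
    if any(d == 0 for d in delta):
        raise ValueError("a Hurwitz determinant vanishes; use the singular-case rules")
    odd = [1] + delta[0::2]                    # 1, Delta_1, Delta_3, ...
    even = [1] + delta[1::2]                   # 1, Delta_2, Delta_4, ...
    return variations(odd) + variations(even)


print(right_half_plane_roots([1, -6, 11, -6]))   # (z - 1)(z - 2)(z - 3)
```

For $(z-1)(z-2)(z-3)$ the Hurwitz determinants are $-6, -60, 360$; the sequence $1, -6, 360$ has two variations and $1, -60$ has one, so $k = 3$.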

53 In this case $s = 0$, so that $\rho = n - 2k$. Moreover, $\Delta_n \neq 0$ means that the fractions
under the index signs in (73') and (73'') are reduced.

54 In computing $\nabla_2, \nabla_4, \ldots, \nabla_{2m}$ the values $a_0, a_1, \ldots, a_m$ and $b_0, b_1, \ldots, b_m$ must be
replaced by $a_0, a_2, \ldots, a_{2m}$ and $0, a_1, a_3, \ldots, a_{2m-1}$ respectively in computing the first index
and by $a_0, a_2, \ldots, a_{2m}$ and $a_1, a_3, \ldots, a_{2m-1}, 0$ respectively in computing the second index.

55 In computing the first index in (70) we take $a_0, a_2, \ldots, a_{2m}, 0$ and $0, a_1, a_3, \ldots, a_{2m+1}$,
respectively, instead of $a_0, a_1, \ldots, a_{m+1}$ and $b_0, b_1, \ldots, b_{m+1}$; and in computing the second
index we take $a_1, a_3, \ldots, a_{2m+1}$ and $a_0, a_2, \ldots, a_{2m}$, respectively, instead of $b_0, b_1, \ldots, b_m$
and $a_0, a_1, \ldots, a_m$.
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2. Note 1. If in the formula 
k=V(l, 4,, As,...) + V (1, 4g, 4g, -.-) 


some intermediate Hurwitz determinants are zero, then the formula remains 
‘valid, only in each group of successive zero determinants 


(4, 0) Aye = A+ = =Aiey= 0 (4499427 0) 
the following signs must be attributed to these determinants (in accordance 
with Theorem 7) 
14-1) 
sign 4,,2,=(—1) * 
which yields: 


sign A, (j=1, 2, eeey p), 


P+! for odd p, 
(79) 
Asset 


V(4;, Ars, eas bly A y409+2) = 


P 


—*\ for even p and ¢ =(— 1) *sign 422 


A careful comparison of this rule for computing k& in the presence of 
vanishing Hurwitz determinants with the rule given in Theorem 5 (p. 201) 
shows that the two rules coincide.” 


Note 2. If $\Delta_n = 0$, then the polynomials ug(u) and h(u) are not co-prime. We denote by d(u) the greatest common divisor of g(u) and h(u) and by $u^{\gamma} d(u)$ that of ug(u) and h(u) ($\gamma = 0$ or 1). We denote the degree of d(u) by $\delta$ and we set $h(u) = d(u)h_1(u)$ and $g(u) = d(u)g_1(u)$.

The irreducible rational fraction $g_1(u)/h_1(u)$ always corresponds to an infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ of rank r, where r is the degree of $h_1(u)$. The corresponding determinant $D_r \neq 0$ and $D_{r+1} = D_{r+2} = \cdots = 0$. By (68') $\nabla_{2r} \neq 0$, $\nabla_{2r+2} = \nabla_{2r+4} = \cdots = 0$. Moreover,

\[ I_{-\infty}^{+\infty} \frac{g_1(u)}{h_1(u)} = r - 2V(1, \nabla_2, \ldots, \nabla_{2r}). \]

When we apply all this to the fractions under the index sign in (74), (75), (77), and (78) we easily find that for every n (even or odd) and $\kappa = 2\delta + \gamma$

\[ \Delta_{n-\kappa-1} \neq 0, \quad \Delta_{n-\kappa} \neq 0, \quad \Delta_{n-\kappa+1} = \cdots = \Delta_n = 0 \]

and that the formulas (74), (75), (77), and (78) all remain valid in this case, provided we omit all the $\Delta_i$ with $i > n - \kappa$ on the right-hand sides and replace the number m (in (77), m + 1) by the degree of the corresponding


56 We have to take account here of the remark made in footnote 36 (p. 201).

denominator of the fraction under the index, after reduction. We then obtain by taking (73') and (73'') into account:

\[ \rho = n - \kappa - 2V(1, \Delta_1, \Delta_3, \ldots) - 2V(1, \Delta_2, \Delta_4, \ldots). \]

Together with the formula $\rho = n - 2k - s$ this gives:

\[ k_1 = V(1, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots), \tag{80} \]

where $k_1 = k + s/2 - \kappa/2$ is the number of all the roots of f(z) in the right half-plane, excluding those that are also roots of f(−z).⁵⁷


§ 13. Some Supplements to the Routh-Hurwitz Theorem. 
Stability Criterion of Liénard and Chipart 


1. Suppose given a polynomial with real coefficients

\[ f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 > 0). \]

Then the Routh-Hurwitz conditions that are necessary and sufficient for all the roots of f(z) to have negative real parts can be written in the form of the inequalities

\[ \Delta_1 > 0,\ \Delta_2 > 0,\ \ldots,\ \Delta_n > 0, \tag{81} \]

where

\[ \Delta_i = \begin{vmatrix} a_1 & a_3 & a_5 & \cdots \\ a_0 & a_2 & a_4 & \cdots \\ 0 & a_1 & a_3 & \cdots \\ 0 & a_0 & a_2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \\ \cdot & \cdot & \cdot & a_i \end{vmatrix} \qquad (a_k = 0 \text{ for } k > n) \]

is the Hurwitz determinant of order i (i = 1, 2, ..., n).

If (81) is satisfied, then f(z) can be represented in the form of a product of $a_0$ with factors of the form $z + u$, $z^2 + vz + w$ (u > 0, v > 0, w > 0), so that all the coefficients of f(z) are positive:⁵⁸


57 This follows from the fact that $\kappa$ is the degree of the greatest common divisor of h(u) and ug(u); $\kappa$ is the number of 'special' roots of f(z), i.e., those roots z* for which −z* is also a root of f(z). The number of these special roots is equal to the number of determinants in the last uninterrupted sequence of vanishing Hurwitz determinants (including $\Delta_n$): $\Delta_{n-\kappa+1} = \cdots = \Delta_n = 0$.

58 $a_0 > 0$, by assumption.

\[ a_1 > 0,\ a_2 > 0,\ \ldots,\ a_n > 0. \tag{82} \]

Unlike (81), the conditions (82) are necessary but by no means sufficient for all the roots of f(z) to lie in the left half-plane Re z < 0.

However, when the conditions (82) hold, then the inequalities (81) are not independent. For example: For n = 4 the Routh-Hurwitz conditions reduce to the single inequality $\Delta_3 > 0$; for n = 5, to the two: $\Delta_2 > 0$, $\Delta_4 > 0$; for n = 6 to the two: $\Delta_3 > 0$, $\Delta_5 > 0$.

This circumstance was investigated by the French mathematicians Liénard and Chipart⁶⁰ in 1914 and enabled them to set up a stability criterion different from the Routh-Hurwitz criterion.


THEOREM 11 (Stability Criterion of Liénard and Chipart): Necessary and sufficient conditions for all the roots of the real polynomial $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ ($a_0 > 0$) to have negative real parts can be given in any one of the following four forms:⁶¹

1) $a_n > 0,\ a_{n-2} > 0,\ \ldots;\ \Delta_1 > 0,\ \Delta_3 > 0,\ \ldots$;
2) $a_n > 0,\ a_{n-2} > 0,\ \ldots;\ \Delta_2 > 0,\ \Delta_4 > 0,\ \ldots$;
3) $a_n > 0;\ a_{n-1} > 0,\ a_{n-3} > 0,\ \ldots;\ \Delta_1 > 0,\ \Delta_3 > 0,\ \ldots$;
4) $a_n > 0;\ a_{n-1} > 0,\ a_{n-3} > 0,\ \ldots;\ \Delta_2 > 0,\ \Delta_4 > 0,\ \ldots$.
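Form 1) of the theorem can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: the helper names are hypothetical, exact fractions are used, and a full Hurwitz test is included solely so that the two criteria can be compared on examples.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def hurwitz_minor(coeffs, order):
    """Delta_order for f(z) = a_0 z^n + ... + a_n (a_k = 0 outside 0..n)."""
    n = len(coeffs) - 1

    def a(k):
        return coeffs[k] if 0 <= k <= n else 0

    return det([[a(2 * j - i) for j in range(1, order + 1)]
                for i in range(1, order + 1)])


def lienard_chipart_stable(coeffs):
    """Form 1): a_n, a_{n-2}, ... > 0 and Delta_1, Delta_3, ... > 0."""
    n = len(coeffs) - 1
    if coeffs[0] <= 0:
        raise ValueError("the criterion is stated for a_0 > 0")
    coefficients_ok = all(c > 0 for c in coeffs[n::-2])  # a_n, a_{n-2}, ...
    minors_ok = all(hurwitz_minor(coeffs, p) > 0 for p in range(1, n + 1, 2))
    return coefficients_ok and minors_ok


def hurwitz_stable(coeffs):
    """All Hurwitz determinants positive, i.e. the inequalities (81)."""
    n = len(coeffs) - 1
    return all(hurwitz_minor(coeffs, p) > 0 for p in range(1, n + 1))
```

For $(z+1)^4 = z^4 + 4z^3 + 6z^2 + 4z + 1$ only $\Delta_1 = 4$ and $\Delta_3 = 64$ have to be evaluated, and both tests report stability; for $z^4 + z^3 + z^2 + z + 1$ both report instability, since $\Delta_3 = -1$.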


From Theorem 11 it follows that Hurwitz's determinant inequalities (81) are not independent for a real polynomial $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n$ ($a_0 > 0$) in which all the coefficients (or even only part of them: $a_n, a_{n-2}, \ldots$ or $a_n, a_{n-1}, a_{n-3}, \ldots$) are positive. In fact: If the Hurwitz determinants of odd order are positive, then those of even order are also positive; and vice versa.

Liénard and Chipart obtained the condition 1) in the paper [259] by 
means of special quadratic forms. We shall give a simpler derivation of the 
condition 1) (and also of 2), 3), 4)) based on Theorem 10 of § 11 and the 
theory of Cauchy indices and we shall obtain these conditions as a special 
case of a much more general theorem which we are now about to expound. 

We again consider the polynomials h(u) and g(u) that are connected 
with f(z) by the identity 


59 This fact has been established for the first few values of n in a number of papers on the theory of governors, independently of the general criterion of Liénard and Chipart, with which the authors of these papers were obviously not acquainted.

60 See [259]. An account of some of the basic results of Liénard and Chipart can be found in the fundamental survey by M. G. Krein and M. A. Naimark [25].

61 Conditions 1), 2), 3), and 4) have a decided advantage over Hurwitz' conditions, because they involve only about half the number of determinantal inequalities.

\[ f(z) = h(z^2) + z\,g(z^2). \]

If n is even, n = 2m, then

\[ h(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_n, \qquad g(u) = a_1 u^{m-1} + a_3 u^{m-2} + \cdots + a_{n-1}; \]

if n is odd, n = 2m + 1, then

\[ h(u) = a_1 u^m + a_3 u^{m-1} + \cdots + a_n, \qquad g(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_{n-1}. \]

The conditions $a_n > 0, a_{n-2} > 0, \ldots$ (or $a_{n-1} > 0, a_{n-3} > 0, \ldots$) can therefore be replaced by the more general condition: h(u) (or g(u)) does not change sign for u > 0.⁶²

Under these conditions we can deduce a formula for the number of roots 
of f(z) in the right half-plane, using only Hurwitz determinants of odd order 
or of even order. 


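THEOREM 12: If for the real polynomial

\[ f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n = h(z^2) + z\,g(z^2) \qquad (a_0 > 0) \]

h(u) (or g(u)) does not change sign for u > 0 and the last Hurwitz determinant $\Delta_n \neq 0$, then the number k of roots of f(z) in the right half-plane is determined by the formulas (83):

h(u) does not change sign for u > 0:

n = 2m:
\[ k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n); \]
n = 2m + 1:
\[ k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n) - \frac{1 - \varepsilon_\infty}{2} = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}) + \frac{1 - \varepsilon_\infty}{2}. \]

g(u) does not change sign for u > 0:

n = 2m:
\[ k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) + \frac{\varepsilon_\infty - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n) - \frac{\varepsilon_\infty - \varepsilon_0}{2}; \]
n = 2m + 1:
\[ k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n) - \frac{1 - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}) + \frac{1 - \varepsilon_0}{2}. \tag{83} \]

where⁶³

\[ \varepsilon_\infty = \operatorname{sign} \left[ \frac{g(u)}{h(u)} \right]_{u=+\infty}, \qquad \varepsilon_0 = \operatorname{sign} \left[ \frac{g(u)}{h(u)} \right]_{u=+0}. \tag{84} \]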
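62 I.e., h(u) ≥ 0 or h(u) ≤ 0 for u > 0 (g(u) ≥ 0 or g(u) ≤ 0 for u > 0).

63 If $a_1 \neq 0$, then $\varepsilon_\infty = \operatorname{sign} a_1$; and, more generally, if $a_1 = a_3 = \cdots = a_{2\nu-1} = 0$, $a_{2\nu+1} \neq 0$, then $\varepsilon_\infty = \operatorname{sign} a_{2\nu+1}$. If $a_{n-1} \neq 0$, then $\varepsilon_0 = \operatorname{sign} a_{n-1}/a_n$; and, more generally, if $a_{n-1} = a_{n-3} = \cdots = a_{n-2\nu+1} = 0$ and $a_{n-2\nu-1} \neq 0$, then $\varepsilon_0 = \operatorname{sign} a_{n-2\nu-1}/a_n$.

Proof. Again we use the notation

Corresponding to the table (83) we consider four cases:

1) n = 2m; h(u) does not change sign for u > 0. Then⁶⁴

\[ I_0^{+\infty} \frac{g(u)}{h(u)} = I_0^{+\infty} \frac{u\,g(u)}{h(u)} = 0, \]

and so the obvious equation

\[ I_{-\infty}^{0} \frac{g(u)}{h(u)} + I_{-\infty}^{0} \frac{u\,g(u)}{h(u)} = 0 \]

implies that:⁶⁵

\[ I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = - I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)}. \]

But then we have from (74) and (75):

\[ V(1, \Delta_1, \Delta_3, \ldots) = V(1, \Delta_2, \Delta_4, \ldots), \]

and therefore the Routh-Hurwitz formula (76) gives:

\[ k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n). \]

2) n = 2m; g(u) does not change sign for u > 0. In this case,

\[ I_0^{+\infty} \frac{h(u)}{g(u)} = I_0^{+\infty} \frac{h(u)}{u\,g(u)} = 0, \]

so that with the notation (84) we obtain:

\[ I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} + I_{-\infty}^{+\infty} \frac{h(u)}{u\,g(u)} = \varepsilon_0. \tag{85} \]

When we replace the functions under the index sign by their reciprocals, then we obtain by 5. (see p. 216):

\[ I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} + I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} = \varepsilon_\infty - \varepsilon_0. \]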


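64 If $h(u_1) = 0$ ($u_1 > 0$), then $g(u_1) \neq 0$, because $\Delta_n \neq 0$. Therefore h(u) ≥ 0 (u > 0) implies that g(u)/h(u) does not change sign in passing through $u = u_1$.

65 From $\Delta_n = a_n \Delta_{n-1} \neq 0$ it follows that $h(0) = a_n \neq 0$.

But this by (74) and (75) gives:

\[ V(1, \Delta_2, \Delta_4, \ldots) - V(1, \Delta_1, \Delta_3, \ldots) = \frac{\varepsilon_\infty - \varepsilon_0}{2}. \]

Hence, in conjunction with the Routh-Hurwitz formula (76), we obtain:

\[ k = 2V(1, \Delta_1, \Delta_3, \ldots) + \frac{\varepsilon_\infty - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) - \frac{\varepsilon_\infty - \varepsilon_0}{2}. \]

3) n = 2m + 1, g(u) does not change sign for u > 0.

In this case, as in the preceding one, (85) holds. When we substitute the expressions for the indices from (77) and (78) into (85), we obtain:

\[ V(1, \Delta_1, \Delta_3, \ldots) - V(1, \Delta_2, \Delta_4, \ldots) = \frac{1 - \varepsilon_0}{2}. \]

In conjunction with the Routh-Hurwitz formula this gives:

\[ k = 2V(1, \Delta_1, \Delta_3, \ldots) - \frac{1 - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) + \frac{1 - \varepsilon_0}{2}. \]

4) n = 2m + 1, h(u) does not change sign for u > 0.

From the equations

\[ I_0^{+\infty} \frac{g(u)}{h(u)} = I_0^{+\infty} \frac{u\,g(u)}{h(u)} = 0 \quad \text{and} \quad I_{-\infty}^{0} \frac{g(u)}{h(u)} + I_{-\infty}^{0} \frac{u\,g(u)}{h(u)} = 0 \]

we deduce:

\[ I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} + I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} = 0. \]

Taking the reciprocals of the functions under the index sign, we obtain:

\[ I_{-\infty}^{+\infty} \frac{h(u)}{g(u)} + I_{-\infty}^{+\infty} \frac{h(u)}{u\,g(u)} = \varepsilon_\infty. \]

Again, when we substitute the expressions for the indices from (77) and (78), we have:

\[ V(1, \Delta_1, \Delta_3, \ldots) - V(1, \Delta_2, \Delta_4, \ldots) = \frac{1 - \varepsilon_\infty}{2}. \]

From this and the Routh-Hurwitz formula it follows that:

\[ k = 2V(1, \Delta_1, \Delta_3, \ldots) - \frac{1 - \varepsilon_\infty}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) + \frac{1 - \varepsilon_\infty}{2}. \]

This completes the proof of Theorem 12.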


From Theorem 12 we obtain Theorem 11 as a special case.

2. COROLLARY TO THEOREM 12: If the real polynomial

\[ f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 > 0) \]

has positive coefficients

\[ a_0 > 0,\ a_1 > 0,\ a_2 > 0,\ \ldots,\ a_n > 0, \]

and $\Delta_n \neq 0$, then the number k of its roots in the right half-plane Re z > 0 is determined by the formula

\[ k = 2V(1, \Delta_1, \Delta_3, \ldots) = 2V(1, \Delta_2, \Delta_4, \ldots). \]
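As a quick check of the corollary on an example that is not taken from the text, consider $f(z) = z^3 + z^2 + z + 6 = (z + 2)(z^2 - z + 3)$. All its coefficients are positive, and it has exactly two roots, $(1 \pm i\sqrt{11})/2$, in the right half-plane. Both forms of the formula reproduce this count:

```latex
\Delta_1 = a_1 = 1, \qquad
\Delta_2 = a_1 a_2 - a_0 a_3 = 1 - 6 = -5, \qquad
\Delta_3 = a_3 \Delta_2 = -30,

k = 2V(1, \Delta_1, \Delta_3) = 2V(1, 1, -30) = 2, \qquad
k = 2V(1, \Delta_2) = 2V(1, -5) = 2.
```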


Note. If in the last formula, or in (83), some of the intermediate Hurwitz determinants are zero, then in the computation of $V(1, \Delta_1, \Delta_3, \ldots)$ and $V(1, \Delta_2, \Delta_4, \ldots)$ the rule given in Note 1 on p. 219 must be followed.

But if $\Delta_n = \Delta_{n-1} = \cdots = \Delta_{n-\kappa+1} = 0$, $\Delta_{n-\kappa} \neq 0$, then we disregard the determinants $\Delta_{n-\kappa+1}, \ldots, \Delta_n$ in (83)⁶⁶ and determine from these formulas the number $k_1$ of the 'non-singular' roots of f(z) in the right half-plane, provided only that $h(u) \neq 0$ for u > 0 or $g(u) \neq 0$ for u > 0.⁶⁷


§ 14. Some Properties of Hurwitz Polynomials. Stieltjes' Theorem.
Representation of Hurwitz Polynomials by Continued Fractions


1. Let

\[ f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 \neq 0) \]

be a real polynomial. We represent it in the form

\[ f(z) = h(z^2) + z\,g(z^2). \]

We shall investigate what conditions have to be imposed on h(u) and g(u) in order that f(z) be a Hurwitz polynomial.

Setting k = s = 0 in (20) (p. 180), we obtain a necessary and sufficient condition for f(z) to be a Hurwitz polynomial, in the form

\[ \rho = n, \]

where, as in the preceding sections,

\[ \rho = I_{-\infty}^{+\infty} \frac{a_1 z^{n-1} - a_3 z^{n-3} + \cdots}{a_0 z^n - a_2 z^{n-2} + \cdots}. \]


66 See p. 220.
67 In this case the polynomials $h_1(u)$ and $g_1(u)$ obtained from h(u) and g(u) by dividing them by their greatest common divisor d(u) satisfy the conditions of Theorem 12.

Let n = 2m. By (73') (p. 218), this condition can be written as follows:

\[ n = 2m = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)}. \tag{86} \]

Since the absolute value of the index of a rational fraction cannot exceed the degree of the denominator (in this case, m), the equation (86) can hold if and only if

\[ I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = m, \qquad I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} = -m \tag{87} \]

hold simultaneously.

For n = 2m + 1 the equation (73'') gives (on account of $\rho = n$):

\[ n = I_{-\infty}^{+\infty} \frac{h(u)}{u\,g(u)} - I_{-\infty}^{+\infty} \frac{h(u)}{g(u)}. \]

When we replace the fractions under the index signs by their reciprocals (see 5. on p. 216) and observe that h(u) and g(u) are of the same degree m, we obtain:⁶⁸

\[ n = 2m + 1 = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} + \varepsilon_\infty. \tag{88} \]

Starting again from the fact that the absolute value of the index of a fraction cannot exceed the degree of the denominator we conclude that (88) holds if and only if

\[ I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = m, \qquad I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} = -m \qquad \text{and} \qquad \varepsilon_\infty = 1 \tag{89} \]

hold simultaneously.

If n = 2m, the first of equations (87) indicates that h(u) has m distinct real roots $u_1 < u_2 < \cdots < u_m$ and that the proper fraction g(u)/h(u) can be represented in the form

\[ \frac{g(u)}{h(u)} = \sum_{i=1}^{m} \frac{R_i}{u - u_i}, \tag{90} \]

where

\[ R_i = \frac{g(u_i)}{h'(u_i)} > 0 \qquad (i = 1, 2, \ldots, m). \tag{90'} \]

From this representation of g(u)/h(u) it follows that between any two roots $u_i$, $u_{i+1}$ of h(u) there is a real root $u_i'$ of g(u) (i = 1, 2, ..., m − 1) and that the highest coefficients of h(u) and g(u) are of like sign, i.e.,


68 As in the preceding section, $\varepsilon_\infty = \operatorname{sign}\,[g(u)/h(u)]_{u=+\infty}$.

\[ h(u) = a_0 (u - u_1) \cdots (u - u_m), \qquad g(u) = a_1 (u - u_1') \cdots (u - u_{m-1}'), \]
\[ u_1 < u_1' < u_2 < u_2' < \cdots < u_{m-1} < u_{m-1}' < u_m; \qquad a_0 a_1 > 0. \]

The second of equations (87) adds only one condition

\[ u_m < 0. \]

By this condition all the roots of h(u) and g(u) must be negative.

If n = 2m + 1, then it follows from the first of equations (89) that h(u) has m distinct real roots $u_1 < u_2 < \cdots < u_m$ and that

\[ \frac{g(u)}{h(u)} = s_{-1} + \sum_{i=1}^{m} \frac{R_i}{u - u_i} \qquad (s_{-1} \neq 0), \tag{91} \]

where

\[ R_i = \frac{g(u_i)}{h'(u_i)} > 0 \qquad (i = 1, 2, \ldots, m). \tag{91'} \]

The third of equations (89) implies that

\[ s_{-1} > 0, \tag{92} \]

i.e., that the highest coefficients $a_0$ and $a_1$ are of like sign. Moreover, it follows from (91), (91'), and (92) that g(u) has m real roots $u_1' < u_2' < \cdots < u_m'$ in the intervals $(-\infty, u_1), (u_1, u_2), \ldots, (u_{m-1}, u_m)$. In other words,

\[ h(u) = a_1 (u - u_1) \cdots (u - u_m), \qquad g(u) = a_0 (u - u_1') \cdots (u - u_m'), \]
\[ u_1' < u_1 < u_2' < u_2 < \cdots < u_m' < u_m; \qquad a_0 a_1 > 0. \]

The second of equations (89), as in the case n = 2m, only adds one further inequality

\[ u_m < 0. \]
DEFINITION 3. We shall say that two polynomials h(u) and g(u) of degree m (or the first of degree m and the second of degree m − 1) form a positive pair⁶⁹ if the roots $u_1, u_2, \ldots, u_m$ and $u_1', u_2', \ldots, u_m'$ (or $u_1', u_2', \ldots, u_{m-1}'$) are all distinct, real, and negative and they alternate as follows:

\[ u_1' < u_1 < u_2' < u_2 < \cdots < u_m' < u_m < 0 \]
\[ (\text{or } u_1 < u_1' < u_2 < \cdots < u_{m-1}' < u_m < 0) \]

and their highest coefficients are of like sign.⁷⁰


69 See [17], p. 333. The definition of a positive pair of polynomials given here differs slightly from that given in the book [17].

70 If we omit the condition that the roots be negative, we obtain a real pair of polynomials. For the application of this concept to the Routh-Hurwitz problem, see [36].

When we introduce the positive numbers $v_i = -u_i$ and $v_i' = -u_i'$ and multiply h(u) and g(u) by +1 or −1 so that their highest coefficients are positive, then we can write the polynomials of this positive pair in the form

\[ h(u) = a_1 \prod_{i=1}^{m} (u + v_i), \qquad g(u) = a_0 \prod_{i=1}^{m} (u + v_i'), \tag{93} \]

where

\[ a_1 > 0,\ a_0 > 0, \qquad 0 < v_m < v_m' < v_{m-1} < v_{m-1}' < \cdots < v_1 < v_1', \]

in case both h(u) and g(u) are of degree m, and in the form

\[ h(u) = a_0 \prod_{i=1}^{m} (u + v_i), \qquad g(u) = a_1 \prod_{i=1}^{m-1} (u + v_i'), \tag{93'} \]

where

\[ a_0 > 0,\ a_1 > 0, \qquad 0 < v_m < v_{m-1}' < v_{m-1} < \cdots < v_1' < v_1, \]

in case h(u) is of degree m and g(u) of degree m − 1.
By our earlier arguments we have proved the following two theorems: 


THEOREM 13: The polynomial $f(z) = h(z^2) + z\,g(z^2)$ is a Hurwitz polynomial if and only if h(u) and g(u) form a positive pair.⁷¹


THEOREM 14: Two polynomials h(u) and g(u) the first of which is of degree m and the second of degree m or m − 1 form a positive pair if and only if the equations

\[ I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = - I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} = m \tag{94} \]

hold and, when h(u) and g(u) are of equal degree, the additional condition

\[ \varepsilon_\infty = \operatorname{sign} \left[ \frac{g(u)}{h(u)} \right]_{u=+\infty} = 1 \tag{95} \]

holds.


2. Using properties of the Cauchy indices we can easily deduce from the last theorem a theorem of Stieltjes on the representation of a fraction g(u)/h(u) as a continued fraction of a special type, provided h(u) and g(u) form a positive pair of polynomials.

The proof of Stieltjes’ theorem will be based on the following lemma: 


71 This theorem is a special case of the so-called Hermite-Biehler theorem (see [7], p. 21).

Lemma. If the polynomials h(u) and g(u) (h(u) of degree m) form a positive pair and

\[ \frac{g(u)}{h(u)} = c + \cfrac{1}{du + \cfrac{h_1(u)}{g_1(u)}}, \tag{96} \]

where c, d are constants and $h_1(u)$, $g_1(u)$ are polynomials of degree not exceeding m − 1, then

1. c ≥ 0, d > 0;

2. $h_1(u)$, $g_1(u)$ are of degree m − 1;

3. $h_1(u)$ and $g_1(u)$ form a positive pair.

Given h(u) and g(u), the polynomials $h_1(u)$ and $g_1(u)$ are uniquely determined (to within a common constant factor) and so are c and d.

Conversely, from (96) and 1., 2., 3. it follows that h(u) and g(u) form a positive pair, that h(u) is of degree m, and g(u) is of degree m or m − 1 according as c > 0 or c = 0.

Proof. Let h(u), g(u) be a positive pair. Then it follows from (94) and (96) that:

\[ m = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} = I_{-\infty}^{+\infty} \cfrac{1}{du + \cfrac{h_1(u)}{g_1(u)}}. \tag{97} \]

This equation implies that $g_1(u)$ is of degree m − 1 and that $d \neq 0$. Further, from (97) we find:

\[ m = - I_{-\infty}^{+\infty} \left[ du + \frac{h_1(u)}{g_1(u)} \right] + \operatorname{sign} d = - I_{-\infty}^{+\infty} \frac{h_1(u)}{g_1(u)} + \operatorname{sign} d. \]

Hence it follows that d > 0 and that

\[ I_{-\infty}^{+\infty} \frac{h_1(u)}{g_1(u)} = -(m - 1). \tag{98} \]

The second of equations (94) now gives:

\[ -m = I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} = I_{-\infty}^{+\infty} \cfrac{1}{d + \cfrac{h_1(u)}{u\,g_1(u)}} = - I_{-\infty}^{+\infty} \frac{h_1(u)}{u\,g_1(u)}. \tag{99} \]

Hence it follows that $h_1(u)$ is of degree m − 1.

Condition (95) yields, by (96): c > 0. But if g(u) is of smaller degree than h(u), then it follows from (96) that c = 0.

(98) and (99) imply:

\[ I_{-\infty}^{+\infty} \frac{g_1(u)}{h_1(u)} = m - 1, \qquad I_{-\infty}^{+\infty} \frac{u\,g_1(u)}{h_1(u)} = -m + \varepsilon_\infty^{(1)}, \tag{100} \]

where

\[ \varepsilon_\infty^{(1)} = \operatorname{sign} \left[ \frac{g_1(u)}{h_1(u)} \right]_{u=+\infty}. \]

Since the second of the indices (100) does not exceed m − 1 in absolute value, we have

\[ \varepsilon_\infty^{(1)} = 1, \tag{101} \]

and then we conclude from (100) and (101), by Theorem 14, that the polynomials $h_1(u)$ and $g_1(u)$ form a positive pair.

From (96) it follows that

\[ c = \lim_{u \to \infty} \frac{g(u)}{h(u)}, \qquad d = \lim_{u \to \infty} \frac{1}{u} \left[ \frac{g(u)}{h(u)} - c \right]^{-1}. \]

After c and d have been found, the ratio $h_1(u)/g_1(u)$ is determined by (96).

The relations (97), (98), (99), (100), and (101), applied in the reverse order, establish the second part of the lemma. Thus the proof of the lemma is complete.

Suppose given a positive pair of polynomials h(u), g(u), with h(u) of degree m. Then when we divide g(u) by h(u) and denote the quotient by $c_0$ and the remainder by $g_1(u)$, we obtain:

\[ \frac{g(u)}{h(u)} = c_0 + \frac{g_1(u)}{h(u)} = c_0 + \cfrac{1}{\cfrac{h(u)}{g_1(u)}}. \]

The fraction $h(u)/g_1(u)$ can be represented in the form $d_0 u + h_1(u)/g_1(u)$, where $h_1(u)$, like $g_1(u)$, is of degree less than m. Hence

\[ \frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0 u + \cfrac{h_1(u)}{g_1(u)}}. \tag{102} \]

Thus, the representation (96) always holds for a positive pair h(u) and g(u). By the lemma

\[ c_0 \geq 0, \qquad d_0 > 0, \]

and the polynomials $h_1(u)$ and $g_1(u)$ are of degree m − 1 and form a positive pair.

When we apply the same arguments to the positive pair $h_1(u)$, $g_1(u)$, we obtain

\[ \frac{g_1(u)}{h_1(u)} = c_1 + \cfrac{1}{d_1 u + \cfrac{h_2(u)}{g_2(u)}}, \tag{102'} \]

where

\[ c_1 > 0, \qquad d_1 > 0, \]

and the polynomials $h_2(u)$ and $g_2(u)$ are of degree m − 2 and form a positive pair. Continuing the process, we finally end up with a positive pair $h_m$ and $g_m$, where $h_m$ and $g_m$ are constants of like sign. We set:

\[ \frac{g_m}{h_m} = c_m. \tag{102$^{(m)}$} \]

Then it follows from (102), (102'), ..., (102$^{(m)}$) that:

\[ \frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0 u + \cfrac{1}{c_1 + \cfrac{1}{d_1 u + \cfrac{1}{\ddots \; + \cfrac{1}{d_{m-1} u + \cfrac{1}{c_m}}}}}}. \]

Using the second part of the lemma, we show similarly that for arbitrary $c_0 \geq 0$, $c_1 > 0$, ..., $c_m > 0$, $d_0 > 0$, $d_1 > 0$, ..., $d_{m-1} > 0$ the above continued fraction determines uniquely (to within a common constant factor) a positive pair of polynomials h(u) and g(u), where h(u) is of degree m and g(u) is of degree m when $c_0 > 0$ and of degree m − 1 when $c_0 = 0$.


Thus we have proved the following theorem.⁷²


72 A proof of Stieltjes' theorem that is not based on the theory of Cauchy indices can be found in the book [17], pp. 333-37.

THEOREM 15 (Stieltjes): If h(u), g(u) is a positive pair of polynomials and h(u) is of degree m, then

\[ \frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0 u + \cfrac{1}{c_1 + \cfrac{1}{d_1 u + \cfrac{1}{\ddots \; + \cfrac{1}{d_{m-1} u + \cfrac{1}{c_m}}}}}}, \tag{103} \]

where

\[ c_0 \geq 0,\ c_1 > 0,\ \ldots,\ c_m > 0, \qquad d_0 > 0,\ \ldots,\ d_{m-1} > 0. \]

Here $c_0 = 0$ if g(u) is of degree m − 1 and $c_0 > 0$ if g(u) is of degree m. The constants $c_i$, $d_i$ are uniquely determined by h(u), g(u).

Conversely, for arbitrary $c_0 \geq 0$ and arbitrary positive $c_1, \ldots, c_m$, $d_0, \ldots, d_{m-1}$, the continued fraction (103) determines a positive pair of polynomials h(u), g(u), where h(u) is of degree m.

From Theorem 13 and Stieltjes’ Theorem we deduce: 


THEOREM 16: A real polynomial of degree n, $f(z) = h(z^2) + z\,g(z^2)$, is a Hurwitz polynomial if and only if the formula (103) holds with non-negative $c_0$ and positive $c_1, \ldots, c_m$, $d_0, \ldots, d_{m-1}$. Here $c_0 > 0$ when n is odd and $c_0 = 0$ when n is even.
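The division process (102), (102'), ... behind Theorems 15 and 16 can be sketched as follows. This is an illustrative Python reading of that process with exact fractions, not code from the text: the function names are hypothetical, polynomials are coefficient lists with the highest power first, and a vanishing leading coefficient during the division is treated as a sign that the expansion (103) with the required shape does not exist.

```python
from fractions import Fraction


def stieltjes_coefficients(h, g):
    """Return ([c_0, ..., c_m], [d_0, ..., d_{m-1}]) of the expansion (103)."""
    h = [Fraction(x) for x in h]
    m = len(h) - 1
    # Pad g with leading zeros so that it is written with degree m.
    g = [Fraction(0)] * (m + 1 - len(g)) + [Fraction(x) for x in g]
    cs, ds = [], []
    while m > 0:
        if h[0] == 0:
            raise ValueError("degree of h dropped; no expansion of type (103)")
        c = g[0] / h[0]
        g1 = [gi - c * hi for gi, hi in zip(g, h)][1:]   # remainder g - c*h
        if g1[0] == 0:
            raise ValueError("degree of g_1 dropped; no expansion of type (103)")
        d = h[0] / g1[0]
        # h_1 = h - d*u*g_1; the list g1 + [0] holds the coefficients of u*g_1.
        h1 = [hi - d * gi for hi, gi in zip(h, g1 + [Fraction(0)])][1:]
        cs.append(c)
        ds.append(d)
        h, g, m = h1, g1, m - 1
    if h[0] == 0:
        raise ValueError("h_m vanishes; no expansion of type (103)")
    cs.append(g[0] / h[0])
    return cs, ds


def is_hurwitz_by_stieltjes(coeffs):
    """Theorem 16 for f(z) = a_0 z^n + ... + a_n, given as [a_0, ..., a_n]."""
    n = len(coeffs) - 1
    h, g = (coeffs[0::2], coeffs[1::2]) if n % 2 == 0 else (coeffs[1::2], coeffs[0::2])
    try:
        cs, ds = stieltjes_coefficients(h, g)
    except ValueError:
        return False
    return cs[0] >= 0 and all(c > 0 for c in cs[1:]) and all(d > 0 for d in ds)
```

For $f(z) = (z+1)(z+2)(z+3)$ we have $h(u) = 6u + 6$, $g(u) = u + 11$, and the sketch gives $c_0 = 1/6$, $d_0 = 3/5$, $c_1 = 5/3$, i.e. $g/h = 1/6 + 1/\bigl(\tfrac{3}{5}u + \tfrac{3}{5}\bigr)$, with $c_0 > 0$ as expected for odd n.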


§ 15. Domain of Stability. Markov Parameters 


1. With every real polynomial of degree n we can associate a point of an n-dimensional space whose coordinates are the quotients of the coefficients divided by the highest coefficient. In this 'coefficient space' all the Hurwitz polynomials form a certain n-dimensional domain which is determined⁷³ by the Hurwitz inequalities $\Delta_1 > 0, \Delta_2 > 0, \ldots, \Delta_n > 0$, or, for example, by the Liénard-Chipart inequalities $a_n > 0, a_{n-2} > 0, \ldots$; $\Delta_1 > 0, \Delta_3 > 0, \ldots$. We shall call it the domain of stability. If the coefficients are given as functions of p parameters, then the domain of stability is constructed in the space of these parameters.


73 For $a_0 = 1$.

The study of the domain of stability is of great practical interest; for example, it is essential in the design of new systems of governors.⁷⁴

In § 17 we shall show that two remarkable theorems which were found by Markov and Chebyshev in connection with the expansion of continued fractions in power series with negative powers of the argument are closely connected with the investigation of the domain of stability. In formulating and proving these theorems it is convenient to give the polynomial not by its coefficients, but by special parameters, which we shall call Markov parameters.

Suppose that

\[ f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_n \qquad (a_0 \neq 0) \]

is a real polynomial. We represent it in the form

\[ f(z) = h(z^2) + z\,g(z^2). \]

We may assume that h(u) and g(u) are co-prime ($\Delta_n \neq 0$). We expand the irreducible rational fraction g(u)/h(u) in a series of decreasing powers of u:⁷⁵

\[ \frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \frac{s_3}{u^4} + \cdots. \tag{104} \]

The sequence $s_0, s_1, s_2, \ldots$ determines an infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$. We define a rational function R(v) by

\[ R(v) = - \frac{g(-v)}{h(-v)}. \tag{105} \]

Then

\[ R(v) = -s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \frac{s_2}{v^3} + \cdots, \tag{106} \]

so that we have the relation (see p. 208)

\[ R(v) \sim S. \tag{107} \]

Hence it follows that the matrix S is of rank m = [n/2], since m, being the degree of h(u), is equal to the number of poles of R(v).⁷⁶

For n = 2m (in this case, $s_{-1} = 0$), the matrix S determines the irreducible fraction g(u)/h(u) uniquely and therefore determines f(z) to within a
74 A number of papers by Y. I. Naimark deal with the investigation of the domain of stability and also of the domains corresponding to various values of k (k is the number of roots in the right half-plane). (See the monograph [41].)

75 In what follows it is convenient to denote the coefficients of the even negative powers of u by $-s_1$, $-s_3$, etc.

76 See Theorem 8 (p. 207).

constant factor. For n = 2m + 1, in order to give f(z) by means of S it is necessary also to know the coefficient $s_{-1}$.

On the other hand, in order to give the infinite Hankel matrix S of rank m it is sufficient to know the first 2m numbers $s_0, s_1, \ldots, s_{2m-1}$. These numbers may be chosen arbitrarily subject to only one restriction

\[ D_m = | s_{i+k} |_0^{m-1} \neq 0; \tag{108} \]

all the subsequent coefficients $s_{2m}, s_{2m+1}, \ldots$ of (104) are uniquely (and rationally) expressible in terms of the first 2m: $s_0, s_1, \ldots, s_{2m-1}$. For in the infinite Hankel matrix S of rank m the elements are connected by a recurrence relation (see Theorem 7 on p. 205)

\[ s_q = \sum_{g=1}^{m} \alpha_g s_{q-g} \qquad (q = m, m+1, \ldots). \tag{109} \]

If the numbers $s_0, s_1, \ldots, s_{2m-1}$ satisfy (108), then the coefficients $\alpha_1, \alpha_2, \ldots, \alpha_m$ in (109) are uniquely determined by the first m relations; the subsequent relations then determine $s_{2m}, s_{2m+1}, \ldots$.

Thus, a real polynomial f(z) of degree n = 2m with $\Delta_n \neq 0$ can be given uniquely⁷⁷ by 2m numbers $s_0, s_1, \ldots, s_{2m-1}$ satisfying (108). When n = 2m + 1, we have to add $s_{-1}$ to these numbers.

We shall call the n values $s_0, s_1, \ldots, s_{2m-1}$ (for n = 2m) or $s_{-1}, s_0, \ldots, s_{2m-1}$ (for n = 2m + 1) the Markov parameters of the polynomial f(z). These parameters may be regarded as the coordinates in an n-dimensional space of a point that represents the given polynomial f(z).

We shall find out what conditions must be imposed on the Markov parameters in order that the corresponding polynomial be a Hurwitz polynomial. In this way we shall determine the domain of stability in the space of Markov parameters.

A Hurwitz polynomial is characterized by the conditions (94) and the additional condition (95) for n = 2m + 1. Introducing the function R(v) (see (105)), we write (94) as follows:

\[ I_{-\infty}^{+\infty} R(v) = m, \qquad I_{-\infty}^{+\infty} v R(v) = m. \tag{110} \]

The additional condition (95) for n = 2m + 1 gives:

\[ s_{-1} > 0. \]

Apart from the matrix $S = \| s_{i+k} \|_0^{\infty}$ we introduce the infinite Hankel matrix $S^{(1)} = \| s_{i+k+1} \|_0^{\infty}$. Then, since by (106)

77 To within a constant factor.

\[ v R(v) = -s_{-1} v + s_0 + \frac{s_1}{v} + \frac{s_2}{v^2} + \cdots, \]

the following relation holds:

\[ v R(v) \sim S^{(1)}. \tag{111} \]

The matrix $S^{(1)}$, like S, is of finite rank m, since the function vR(v), like R(v), has m poles. Therefore the forms

\[ S_m(x, x) = \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad S_m^{(1)}(x, x) = \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k \]

are of rank m. But by Theorem 9 (p. 190) the signatures of these forms, in virtue of (107) and (111), are equal to the indices (110) and hence also to m. Thus, the conditions (110) mean that the quadratic forms $S_m(x, x)$ and $S_m^{(1)}(x, x)$ are positive definite. Hence:


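THEOREM 17: A real polynomial $f(z) = h(z^2) + z\,g(z^2)$ of degree n = 2m or n = 2m + 1 is a Hurwitz polynomial if and only if:⁷⁸

1. The quadratic forms

\[ S_m(x, x) = \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad S_m^{(1)}(x, x) = \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k \tag{112} \]

are positive definite; and

2. (For n = 2m + 1)

\[ s_{-1} > 0. \tag{113} \]

Here $s_{-1}, s_0, s_1, \ldots, s_{2m-1}$ are the coefficients of the expansion

\[ \frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots. \]
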

78 We do not mention the inequality $\Delta_n \neq 0$ expressly, because it follows automatically from the conditions of the theorem. For if f(z) is a Hurwitz polynomial, then it is known that $\Delta_n \neq 0$. But if the conditions 1., 2. are given, then the fact that the form $S_m^{(1)}(x, x)$ is positive definite implies that

\[ - I_{-\infty}^{+\infty} \frac{u\,g(u)}{h(u)} = I_{-\infty}^{+\infty} v R(v) = m, \]

and from this it follows that the fraction ug(u)/h(u) is reduced, which can be expressed by the inequality $\Delta_n \neq 0$.

In exactly the same way, it follows automatically from the conditions of the theorem that $D_m = | s_{i+k} |_0^{m-1} \neq 0$, i.e., that the numbers $s_0, s_1, \ldots, s_{2m-1}$, and (for n = 2m + 1) $s_{-1}$ are the Markov parameters of f(z).

We introduce a notation for the determinants

\[ D_p = | s_{i+k} |_0^{p-1}, \qquad D_p^{(1)} = | s_{i+k+1} |_0^{p-1} \qquad (p = 1, 2, \ldots, m). \tag{114} \]

Then condition 1. is equivalent to the system of determinantal inequalities

\[ D_1 = s_0 > 0, \quad D_2 = \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix} > 0, \quad \ldots, \quad D_m = \begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} \\ s_1 & s_2 & \cdots & s_m \\ \vdots & \vdots & & \vdots \\ s_{m-1} & s_m & \cdots & s_{2m-2} \end{vmatrix} > 0, \]

\[ D_1^{(1)} = s_1 > 0, \quad D_2^{(1)} = \begin{vmatrix} s_1 & s_2 \\ s_2 & s_3 \end{vmatrix} > 0, \quad \ldots, \quad D_m^{(1)} = \begin{vmatrix} s_1 & s_2 & \cdots & s_m \\ s_2 & s_3 & \cdots & s_{m+1} \\ \vdots & \vdots & & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} \end{vmatrix} > 0. \tag{115} \]

If n = 2m, the inequalities (115) determine the domain of stability in the space of Markov parameters. If n = 2m + 1, we have to add the further inequality:

\[ s_{-1} > 0. \tag{116} \]

In the next section we shall find out what properties of S follow from the inequalities (115) and, in so doing, shall single out the special class of infinite Hankel matrices S that correspond to Hurwitz polynomials.
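The inequalities (115) and (116) lend themselves to a direct computation. The sketch below, which is an illustration and not part of the text, obtains the Markov parameters by formal long division of g(u) by h(u) in powers of 1/u, as in (104), and then tests the Hankel determinants; all function names are hypothetical, and the code assumes that the leading coefficient of h(u) is different from zero.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def markov_parameters(coeffs):
    """Return (s_{-1}, [s_0, ..., s_{2m-1}]) for f given as [a_0, ..., a_n]."""
    n = len(coeffs) - 1
    h, g = (coeffs[0::2], coeffs[1::2]) if n % 2 == 0 else (coeffs[1::2], coeffs[0::2])
    m = len(h) - 1
    h = [Fraction(x) for x in h]
    g = [Fraction(0)] * (m + 1 - len(g)) + [Fraction(x) for x in g]
    if h[0] == 0:
        raise ValueError("h(u) must have degree m")
    q = []  # g(u)/h(u) = q_0 + q_1/u + q_2/u^2 + ...
    for j in range(2 * m + 1):
        gj = g[j] if j <= m else Fraction(0)
        acc = sum(h[i] * q[j - i] for i in range(1, min(j, m) + 1))
        q.append((gj - acc) / h[0])
    # (104): the coefficient of u^{-(p+1)} is (-1)^p s_p.
    return q[0], [(-1) ** p * q[p + 1] for p in range(2 * m)]


def is_hurwitz_by_markov(coeffs):
    """Check D_p > 0, D_p^(1) > 0 (p = 1..m) and, for odd n, s_{-1} > 0."""
    s_minus1, s = markov_parameters(coeffs)
    m = len(s) // 2

    def hankel(size, shift):
        return det([[s[i + k + shift] for k in range(size)] for i in range(size)])

    forms_ok = all(hankel(p, 0) > 0 and hankel(p, 1) > 0 for p in range(1, m + 1))
    odd_degree = (len(coeffs) - 1) % 2 == 1
    return forms_ok and (s_minus1 > 0 if odd_degree else True)
```

For $(z+1)^4$ the Markov parameters come out as $s_0 = 4$, $s_1 = 20$, $s_2 = 116$, $s_3 = 676$, so that $D_1 = 4$, $D_2 = 64$, $D_1^{(1)} = 20$, $D_2^{(1)} = 64$, all positive.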


§ 16. Connection with the Problem of Moments 


1. We begin by stating the following problem:


PROBLEM OF MOMENTS FOR THE POSITIVE AXIS 0 < v < ∞:⁷⁹ Given a sequence $s_0, s_1, \ldots$ of real numbers, it is required to determine positive numbers

\[ \mu_1 > 0,\ \mu_2 > 0,\ \ldots,\ \mu_m > 0, \qquad 0 < v_1 < v_2 < \cdots < v_m, \tag{117} \]

such that the following equations hold:

\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \qquad (p = 0, 1, 2, \ldots). \tag{118} \]

It is not difficult to see that the system (118) of equations is equivalent to the following expansion in a series of negative powers of u:

79 This problem of moments ought to be called discrete in contrast to the general exponential problem of moments, in which the sums $\sum_{j=1}^{m} \mu_j v_j^p$ are replaced by Stieltjes integrals $\int_0^{\infty} v^p \, d\mu(v)$ (see [55]).

\[ \sum_{j=1}^{m} \frac{\mu_j}{u + v_j} = \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots. \tag{119} \]

In this case the infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ is of finite rank m and by (117) in the irreducible proper fraction

\[ \frac{g(u)}{h(u)} = \sum_{j=1}^{m} \frac{\mu_j}{u + v_j} \tag{120} \]

(we choose the highest coefficients of h(u) and g(u) to be positive) the polynomials h(u) and g(u) form a positive pair (see (91) and (91')).

Therefore (see Theorem 14), our problem of moments has a solution if and only if the sequence $s_0, s_1, s_2, \ldots$ determines by means of (119) and (120) a Hurwitz polynomial $f(z) = h(z^2) + z\,g(z^2)$ of degree 2m.

The solution of the problem of moments is unique, because the positive numbers $v_j$ and $\mu_j$ (j = 1, 2, ..., m) are uniquely determined from the expansion (119).

Apart from the 'infinite' problem of moments (118) we also consider the 'finite' problem of moments given by the first 2m equations (118)

\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \qquad (p = 0, 1, \ldots, 2m-1). \tag{121} \]

These relations already determine the following expressions for the Hankel quadratic forms:

\[ \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k = \sum_{j=1}^{m} \mu_j \left( x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \right)^2, \]
\[ \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k = \sum_{j=1}^{m} \mu_j v_j \left( x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \right)^2. \tag{122} \]

Since the linear forms in the variables $x_0, x_1, \ldots, x_{m-1}$

\[ x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \qquad (j = 1, 2, \ldots, m) \]

are independent (their coefficients form a non-vanishing Vandermonde determinant), the quadratic forms (122) are positive definite. But then by Theorem 17 the numbers $s_0, s_1, \ldots, s_{2m-1}$ are the Markov parameters of a certain Hurwitz polynomial f(z). They are the first 2m coefficients of the expansion (119). Together with the remaining coefficients $s_{2m}, s_{2m+1}, \ldots$ they determine the infinite solvable problem of moments (118), which has the same solution as the finite problem (121).


Thus we have proved the following theorem:

THEOREM 18: 1) The finite problem of moments

\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \tag{123} \]

($p = 0, 1, \ldots, 2m-1$; $\mu_1 > 0, \ldots, \mu_m > 0$; $0 < v_1 < v_2 < \cdots < v_m$), where $s_p$ are given real numbers and $v_j$ and $\mu_j$ are unknown real numbers (p = 0, 1, ..., 2m − 1; j = 1, 2, ..., m) has a solution if and only if the quadratic forms

\[ \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k \tag{124} \]

are positive definite, i.e., if the numbers $s_0, s_1, \ldots, s_{2m-1}$ are the Markov parameters of some Hurwitz polynomial of degree 2m.


2) The infinite problem of moments

\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \tag{125} \]

($p = 0, 1, 2, \ldots$; $\mu_1 > 0, \ldots, \mu_m > 0$; $0 < v_1 < v_2 < \cdots < v_m$), where $s_p$ are given real numbers and $v_j$ and $\mu_j$ are unknown real numbers (p = 0, 1, ...; j = 1, 2, ..., m) has a solution if and only if 1. the quadratic forms (124) are positive definite and 2. the infinite Hankel matrix $S = \| s_{i+k} \|_0^{\infty}$ is of rank m, i.e., if the series

\[ \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots = \frac{g(u)}{h(u)} \tag{126} \]

determines a Hurwitz polynomial $f(z) = h(z^2) + z\,g(z^2)$ of degree 2m.

3) The solution of the problem of moments, both the finite (123) and the infinite (125) problem, is always unique.
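The identities (122), on which the positivity argument rests, can be checked directly for given weights and nodes. The following Python sketch is only an illustration under stated assumptions: the weights $\mu_j$ and nodes $v_j$ are supplied by the user, the function names are hypothetical, and the code verifies the forward direction (from a solution to the forms), not the reconstruction of $\mu_j$, $v_j$ from the moments.

```python
from fractions import Fraction


def moments(mu, v, count):
    """s_p = sum_j mu_j v_j^p for p = 0, ..., count - 1, as in (118)."""
    return [sum(Fraction(m_j) * Fraction(v_j) ** p for m_j, v_j in zip(mu, v))
            for p in range(count)]


def hankel_form(s, x, shift=0):
    """Left-hand sides of (122): sum over i, k of s_{i+k+shift} x_i x_k."""
    m = len(x)
    return sum(s[i + k + shift] * x[i] * x[k] for i in range(m) for k in range(m))


def weighted_squares(mu, v, x, shift=0):
    """Right-hand sides of (122): sum_j mu_j v_j^shift (x_0 + x_1 v_j + ...)^2."""
    total = Fraction(0)
    for m_j, v_j in zip(mu, v):
        v_j = Fraction(v_j)
        linear_form = sum(x_i * v_j ** i for i, x_i in enumerate(x))
        total += Fraction(m_j) * v_j ** shift * linear_form ** 2
    return total
```

With $\mu = (1, 2)$ and $v = (1, 3)$ the first four moments are 3, 7, 19, 55, and for $x = (2, -1)$ both sides of the first identity in (122) equal 3, while both sides of the second equal 7.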


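2. We shall use this theorem in investigating the minors of an infinite
Hankel matrix $S = \| s_{i+k} \|_0^\infty$ of rank $m$ corresponding to some Hurwitz
polynomial, i.e., one for which the quadratic forms (124) are positive definite.
In this case the generating numbers $s_0, s_1, s_2, \ldots$ of $S$ can be represented in
the form (123), so that for an arbitrary minor of $S$ of order $h \le m$ we have:

$$
\begin{Vmatrix}
s_{i_1+k_1} & \cdots & s_{i_1+k_h} \\
\vdots & & \vdots \\
s_{i_h+k_1} & \cdots & s_{i_h+k_h}
\end{Vmatrix}
=
\begin{Vmatrix}
\mu_1 v_1^{i_1} & \mu_2 v_2^{i_1} & \cdots & \mu_m v_m^{i_1} \\
\vdots & \vdots & & \vdots \\
\mu_1 v_1^{i_h} & \mu_2 v_2^{i_h} & \cdots & \mu_m v_m^{i_h}
\end{Vmatrix}
\cdot
\begin{Vmatrix}
v_1^{k_1} & \cdots & v_1^{k_h} \\
v_2^{k_1} & \cdots & v_2^{k_h} \\
\vdots & & \vdots \\
v_m^{k_1} & \cdots & v_m^{k_h}
\end{Vmatrix}
$$

and therefore

$$
S\begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ k_1 & k_2 & \cdots & k_h \end{pmatrix}
= \sum_{1 \le a_1 < a_2 < \cdots < a_h \le m} \mu_{a_1} \mu_{a_2} \cdots \mu_{a_h}
\begin{vmatrix}
v_{a_1}^{i_1} & v_{a_2}^{i_1} & \cdots & v_{a_h}^{i_1} \\
v_{a_1}^{i_2} & v_{a_2}^{i_2} & \cdots & v_{a_h}^{i_2} \\
\vdots & \vdots & & \vdots \\
v_{a_1}^{i_h} & v_{a_2}^{i_h} & \cdots & v_{a_h}^{i_h}
\end{vmatrix}
\begin{vmatrix}
v_{a_1}^{k_1} & v_{a_2}^{k_1} & \cdots & v_{a_h}^{k_1} \\
v_{a_1}^{k_2} & v_{a_2}^{k_2} & \cdots & v_{a_h}^{k_2} \\
\vdots & \vdots & & \vdots \\
v_{a_1}^{k_h} & v_{a_2}^{k_h} & \cdots & v_{a_h}^{k_h}
\end{vmatrix}.
\tag{127}
$$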


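But from the inequalities

$$0 < v_1 < v_2 < \cdots < v_m, \qquad i_1 < i_2 < \cdots < i_h, \qquad k_1 < k_2 < \cdots < k_h$$

it follows that the generalized Vandermonde determinants⁸⁰

$$
\begin{vmatrix}
v_{a_1}^{i_1} & v_{a_2}^{i_1} & \cdots & v_{a_h}^{i_1} \\
v_{a_1}^{i_2} & v_{a_2}^{i_2} & \cdots & v_{a_h}^{i_2} \\
\vdots & \vdots & & \vdots \\
v_{a_1}^{i_h} & v_{a_2}^{i_h} & \cdots & v_{a_h}^{i_h}
\end{vmatrix} > 0,
\qquad
\begin{vmatrix}
v_{a_1}^{k_1} & v_{a_2}^{k_1} & \cdots & v_{a_h}^{k_1} \\
v_{a_1}^{k_2} & v_{a_2}^{k_2} & \cdots & v_{a_h}^{k_2} \\
\vdots & \vdots & & \vdots \\
v_{a_1}^{k_h} & v_{a_2}^{k_h} & \cdots & v_{a_h}^{k_h}
\end{vmatrix} > 0
$$

are positive.
Since the numbers $\mu_j$ are positive ($j = 1, 2, \ldots, m$), it therefore follows
from (127) that

$$
S\begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ k_1 & k_2 & \cdots & k_h \end{pmatrix} > 0
\qquad
\begin{pmatrix} 0 \le i_1 < i_2 < \cdots < i_h \\ 0 \le k_1 < k_2 < \cdots < k_h \end{pmatrix};\ h = 1, 2, \ldots, m.
\tag{128}
$$

Conversely, if in an infinite Hankel matrix $S = \| s_{i+k} \|_0^\infty$ of rank $m$ all
the minors of every order $h \le m$ are positive, then the quadratic forms (124)
are positive definite.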


DEFINITION 4: An infinite matrix $A = \| a_{ik} \|_0^\infty$ will be called totally
positive of rank $m$ if and only if all the minors of $A$ of order $h \le m$ are
positive and all the minors of order $h > m$ are zero.

The property of $S$ that we have found can now be expressed in the following
theorem:⁸¹


THEOREM 19: An infinite Hankel matrix $S = \| s_{i+k} \|_0^\infty$ is totally positive
of rank $m$ if and only if 1) $S$ is of rank $m$ and 2) the quadratic forms

$$\sum_{i,k=0}^{m-1} s_{i+k}\,x_i x_k, \qquad \sum_{i,k=0}^{m-1} s_{i+k+1}\,x_i x_k$$

are positive definite.
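
Definition 4 and Theorem 19 can be tried out on a finite section of $S$. The following sketch is an illustration under stated assumptions, not part of the original text: the helper names `det`, `hankel_section` and `is_totally_positive_of_rank` are hypothetical, only an $N \times N$ section of the infinite matrix is inspected, and exact rational arithmetic is used so that the vanishing of the minors of order $m + 1$ can be tested without rounding errors.

```python
from fractions import Fraction
from itertools import combinations

def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in rows]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return sign * result

def hankel_section(mu, v, size):
    """size x size section of S = ||s_{i+k}|| with s_p = sum_j mu_j v_j^p."""
    s = [sum(m_j * v_j ** p for m_j, v_j in zip(mu, v))
         for p in range(2 * size - 1)]
    return [[s[i + k] for k in range(size)] for i in range(size)]

def is_totally_positive_of_rank(section, m):
    """Check Definition 4 on a finite section.

    Minors of order <= m must be positive; minors of order m + 1 must vanish
    (all minors of higher order then vanish automatically).
    """
    n = len(section)
    for h in range(1, min(m + 1, n) + 1):
        for rows in combinations(range(n), h):
            for cols in combinations(range(n), h):
                d = det([[section[i][k] for k in cols] for i in rows])
                if (h <= m and d <= 0) or (h > m and d != 0):
                    return False
    return True

S = hankel_section(mu=[1, 2], v=[1, 3], size=4)
print(is_totally_positive_of_rank(S, 2))
```

The brute-force enumeration of minors is exponential in the section size, so it only serves to make the definition concrete for small examples.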


80 See p. 99, Example 1.
81 See [173].
From this theorem and Theorem 17 we obtain:


THEOREM 20: A real polynomial $f(z)$ of degree $n$ is a Hurwitz polynomial
if and only if the corresponding infinite Hankel matrix $S = \| s_{i+k} \|_0^\infty$
is totally positive of rank $m = [n/2]$ and if, in addition, $s_{-1} > 0$ when
$n$ is odd.


Here the elements $s_0, s_1, s_2, \ldots$ of $S$ and $s_{-1}$ are determined by the
expansion

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots, \tag{129}$$

where

$$f(z) = h(z^2) + z g(z^2).$$
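
The expansion (129) lends itself to a direct computation. The sketch below is illustrative and not from the original text: the names `markov_parameters` and `is_hurwitz_by_markov` are hypothetical, the polynomial is assumed to have degree $n \ge 2$, and the stability test assumes (in the spirit of Theorems 19 and 20) that it suffices to check the positive definiteness of the two forms (124) together with $s_{-1} > 0$ for odd $n$.

```python
import numpy as np

def markov_parameters(coeffs):
    """Return (s_{-1}, [s_0, ..., s_{2m-1}]) for f given by its coefficients,
    highest power first, using f(z) = h(z^2) + z g(z^2) and expansion (129)."""
    a = list(coeffs)[::-1]                 # a[k] multiplies z^k
    n = len(a) - 1
    m = n // 2
    h = [a[2 * j] for j in range(m + 1)]   # ascending powers of u
    g = [a[2 * j + 1] if 2 * j + 1 <= n else 0.0 for j in range(m + 1)]
    hd, gd = h[::-1], g[::-1]              # descending powers of u
    if hd[0] == 0:
        raise ValueError("h(u) has degree less than m")
    # g(u)/h(u) = c_0 + c_1/u + c_2/u^2 + ... by power-series division.
    c = []
    for k in range(2 * m + 1):
        acc = gd[k] if k <= m else 0.0
        for j in range(1, min(k, m) + 1):
            acc -= hd[j] * c[k - j]
        c.append(acc / hd[0])
    # Compare with (129): c_0 = s_{-1}, c_{j+1} = (-1)^j s_j.
    return c[0], [(-1) ** j * c[j + 1] for j in range(2 * m)]

def is_hurwitz_by_markov(coeffs):
    """Stability test through the forms (124); assumes degree n >= 2."""
    n = len(coeffs) - 1
    m = n // 2
    try:
        s_minus1, s = markov_parameters(coeffs)
    except ValueError:
        return False
    H0 = np.array([[s[i + k] for k in range(m)] for i in range(m)], dtype=float)
    H1 = np.array([[s[i + k + 1] for k in range(m)] for i in range(m)], dtype=float)
    forms_ok = np.linalg.eigvalsh(H0).min() > 0 and np.linalg.eigvalsh(H1).min() > 0
    return bool(forms_ok and (n % 2 == 0 or s_minus1 > 0))

# f(z) = (z + 1)(z + 2) = z^2 + 3z + 2: h(u) = u + 2, g(u) = 3.
print(markov_parameters([1, 3, 2]))
print(is_hurwitz_by_markov([1, 3, 2]), is_hurwitz_by_markov([1, -3, 2]))
```

For $f(z) = z^2 + 3z + 2$ one has $g(u)/h(u) = 3/(u + 2) = 3/u - 6/u^2 + \cdots$, so that $s_0 = 3$, $s_1 = 6$, in agreement with the representation $s_p = \mu v^p$ with $\mu = 3$, $v = 2$.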


§ 17. Theorems of Markov and Chebyshev 


1. In a notable memoir ‘On functions obtained by converting series into
continued fractions’⁸² Markov proved two theorems, the second of which had
been established in 1892 by Chebyshev⁸³ by other methods, and not in the
same generality.

In this section we shall show that these theorems have an immediate bearing
on the study of the domain of stability in the Markov parameters and
shall give a comparatively simple proof (without reference to continued
fractions) which is based on Theorem 19 of the preceding section.

In proceeding to state the first theorem, we quote the corresponding
passage from the above-mentioned memoir of Markov:⁸⁴


On the basis of what has preceded it is not difficult to prove
two remarkable theorems with which we conclude our paper.
One is concerned with the determinants⁸⁵

$$\Delta_1, \Delta_2, \ldots, \Delta_m; \qquad \Delta^{(1)}, \Delta^{(2)}, \ldots, \Delta^{(m)}$$

and the other with the roots of the equation⁸⁶

$$\psi_m(x) = 0.$$


82 Zap. Petersburg Akad. Nauk, Petersburg, 1894 [in Russian]; see also [38],
pp. 78-105.


83 This theorem was first published in Chebyshev’s paper ‘On the expansion in
continued fractions of series in descending powers of the variable’ [in Russian]. See [8],
pp. 307-62.

84 [38], p. 95, beginning with line 3 from below.

85 In our notation, $D_1, D_2, \ldots, D_m, D_1^{(1)}, D_2^{(1)}, \ldots, D_m^{(1)}$. (See p. 236.)

86 In our notation, $h(-x) = 0$.


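THEOREM ON DETERMINANTS: If we have for the numbers

$$s_0, s_1, s_2, \ldots, s_{2m-2}, s_{2m-1}$$

two sets of values

1. $s_0 = a_0,\ s_1 = a_1,\ s_2 = a_2,\ \ldots,\ s_{2m-2} = a_{2m-2},\ s_{2m-1} = a_{2m-1}$,

2. $s_0 = b_0,\ s_1 = b_1,\ s_2 = b_2,\ \ldots,\ s_{2m-2} = b_{2m-2},\ s_{2m-1} = b_{2m-1}$

for which all the determinants

$$
\Delta_1 = s_0, \quad
\Delta_2 = \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix}, \quad \ldots, \quad
\Delta_m = \begin{vmatrix}
s_0 & s_1 & \cdots & s_{m-1} \\
s_1 & s_2 & \cdots & s_m \\
\vdots & \vdots & & \vdots \\
s_{m-1} & s_m & \cdots & s_{2m-2}
\end{vmatrix},
$$

$$
\Delta^{(1)} = s_1, \quad
\Delta^{(2)} = \begin{vmatrix} s_1 & s_2 \\ s_2 & s_3 \end{vmatrix}, \quad \ldots, \quad
\Delta^{(m)} = \begin{vmatrix}
s_1 & s_2 & \cdots & s_m \\
s_2 & s_3 & \cdots & s_{m+1} \\
\vdots & \vdots & & \vdots \\
s_m & s_{m+1} & \cdots & s_{2m-1}
\end{vmatrix}
$$

turn out to be positive numbers satisfying the inequalities

$$a_0 \ge b_0,\ b_1 \ge a_1,\ a_2 \ge b_2,\ b_3 \ge a_3,\ \ldots,\ a_{2m-2} \ge b_{2m-2},\ b_{2m-1} \ge a_{2m-1},$$

then our determinants

$$\Delta_1, \Delta_2, \ldots, \Delta_m; \qquad \Delta^{(1)}, \Delta^{(2)}, \ldots, \Delta^{(m)}$$

must be positive for all values

$$s_0, s_1, s_2, \ldots, s_{2m-1}$$

satisfying the inequalities

$$a_0 \ge s_0 \ge b_0,\ b_1 \ge s_1 \ge a_1,\ a_2 \ge s_2 \ge b_2,\ \ldots,\ a_{2m-2} \ge s_{2m-2} \ge b_{2m-2},\ b_{2m-1} \ge s_{2m-1} \ge a_{2m-1}.$$

Under these conditions we have

$$
\begin{vmatrix}
a_0 & a_1 & \cdots & a_{k-1} \\
a_1 & a_2 & \cdots & a_k \\
\vdots & \vdots & & \vdots \\
a_{k-1} & a_k & \cdots & a_{2k-2}
\end{vmatrix}
\ge
\begin{vmatrix}
s_0 & s_1 & \cdots & s_{k-1} \\
s_1 & s_2 & \cdots & s_k \\
\vdots & \vdots & & \vdots \\
s_{k-1} & s_k & \cdots & s_{2k-2}
\end{vmatrix}
\ge
\begin{vmatrix}
b_0 & b_1 & \cdots & b_{k-1} \\
b_1 & b_2 & \cdots & b_k \\
\vdots & \vdots & & \vdots \\
b_{k-1} & b_k & \cdots & b_{2k-2}
\end{vmatrix}
$$

and

$$
\begin{vmatrix}
b_1 & b_2 & \cdots & b_k \\
b_2 & b_3 & \cdots & b_{k+1} \\
\vdots & \vdots & & \vdots \\
b_k & b_{k+1} & \cdots & b_{2k-1}
\end{vmatrix}
\ge
\begin{vmatrix}
s_1 & s_2 & \cdots & s_k \\
s_2 & s_3 & \cdots & s_{k+1} \\
\vdots & \vdots & & \vdots \\
s_k & s_{k+1} & \cdots & s_{2k-1}
\end{vmatrix}
\ge
\begin{vmatrix}
a_1 & a_2 & \cdots & a_k \\
a_2 & a_3 & \cdots & a_{k+1} \\
\vdots & \vdots & & \vdots \\
a_k & a_{k+1} & \cdots & a_{2k-1}
\end{vmatrix}
$$

for $k = 1, 2, \ldots, m$.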


In order to give another statement of this theorem in connection with
the problem of stability, we introduce some concepts and notations.

The Markov parameters $s_0, s_1, \ldots, s_{2m-1}$ (for $n = 2m$) or $s_{-1}, s_0, s_1, \ldots,$
$s_{2m-1}$ (for $n = 2m+1$) will be regarded as the coordinates of some point $P$
in an $n$-dimensional space. The domain of stability in this space will be
denoted by $G$. The domain $G$ is characterized by the inequalities (115) and
(116) (p. 236).

We shall say that a point $P = \{s_i\}$ ‘precedes’ a point $P^* = \{s_i^*\}$ and
shall write $P \prec P^*$ if

$$
\begin{aligned}
& s_0 \le s_0^*,\quad s_1 \ge s_1^*,\quad s_2 \le s_2^*,\quad s_3 \ge s_3^*,\quad \ldots,\quad s_{2m-1} \ge s_{2m-1}^* \\
& \text{and (for } n = 2m+1\text{)} \quad s_{-1} \ge s_{-1}^*
\end{aligned}
\tag{130}
$$

and strict inequality holds in at least one of these relations.
If only the relations (130) hold, without the last clause, then we shall
write:

$$P \preceq P^*.$$

We shall say that a point $Q$ lies ‘between’ $P$ and $R$ if $P \preceq Q \preceq R$.
To every point $P$ there corresponds an infinite Hankel matrix of rank
$m$: $S = \| s_{i+k} \|_0^\infty$. We shall denote this matrix by $S_P$.


Now we can state Markov’s theorem in the following way:


THEOREM 21 (Markov): If two points $P$ and $R$ belong to the domain of
stability $G$ and if $P$ precedes $R$, then every point $Q$ between $P$ and $R$ also
belongs to $G$, i.e.,


from $P, R \in G$, $P \preceq Q \preceq R$ it follows that $Q \in G$.


Proof. From $P \preceq Q \preceq R$ it follows that $P$ and $R$ can be connected
by an arc of a curve

$$s_i = (-1)^i \varphi_i(t) \qquad [\alpha \le t \le \gamma;\ i = 0, 1, \ldots, 2m-1 \text{ and (for } n = 2m+1\text{) } i = -1] \tag{131}$$

passing through $Q$ such that: 1) the functions $\varphi_i(t)$ are continuous, monotonic
increasing, and differentiable when $t$ varies from $t = \alpha$ to $t = \gamma$; and
2) the values $\alpha, \beta, \gamma$ ($\alpha < \beta < \gamma$) of $t$ correspond to the points $P, Q, R$ on
the curve.

From the values (131) we form the infinite Hankel matrix $S = S(t) =$
$\| s_{i+k}(t) \|_0^\infty$ of rank $m$. We consider part of this matrix, namely the
rectangular matrix

$$
\begin{Vmatrix}
s_0 & s_1 & \cdots & s_{m-1} & s_m \\
s_1 & s_2 & \cdots & s_m & s_{m+1} \\
\vdots & \vdots & & \vdots & \vdots \\
s_{m-1} & s_m & \cdots & s_{2m-2} & s_{2m-1}
\end{Vmatrix}.
\tag{132}
$$

By the conditions of the theorem, the matrix $S(t)$ is totally positive of
rank $m$ for $t = \alpha$ and $t = \gamma$, so that all the minors of (132) of order $p = 1, 2,$
$3, \ldots, m$ are positive.

We shall now show that this property also holds for every intermediate
value of $t$ ($\alpha < t < \gamma$).

For $p = 1$, this is obvious. Let us prove the statement for the minors
of order $p$, on the assumption that it is true for those of order $p-1$. We
consider an arbitrary minor of order $p$ formed from successive rows and
columns of (132):

$$
D_p^{(q)} =
\begin{vmatrix}
s_q & s_{q+1} & \cdots & s_{q+p-1} \\
s_{q+1} & s_{q+2} & \cdots & s_{q+p} \\
\vdots & \vdots & & \vdots \\
s_{q+p-1} & s_{q+p} & \cdots & s_{q+2p-2}
\end{vmatrix}
\qquad [q = 0, 1, \ldots, 2(m-p)+1].
\tag{133}
$$


We compute the derivative of this minor

$$\frac{d}{dt} D_p^{(q)} = \sum_{i,k=0}^{p-1} \frac{\partial D_p^{(q)}}{\partial s_{q+i+k}} \frac{d s_{q+i+k}}{dt}. \tag{134}$$

Here $\dfrac{\partial D_p^{(q)}}{\partial s_{q+i+k}}$ ($i, k = 0, 1, \ldots, p-1$) are the algebraic complements (adjoints)
of the elements of $D_p^{(q)}$. Since by assumption all the minors of order $p-1$ of this
determinant are positive, we have

$$(-1)^{i+k} \frac{\partial D_p^{(q)}}{\partial s_{q+i+k}} > 0 \qquad (i, k = 0, 1, \ldots, p-1). \tag{135}$$

On the other hand, we find from (131):

$$(-1)^{q+i+k} \frac{d s_{q+i+k}}{dt} = \frac{d \varphi_{q+i+k}}{dt} \ge 0 \qquad (i, k = 0, 1, \ldots, p-1). \tag{136}$$

From (134), (135), and (136) it follows that

$$(-1)^q \frac{d}{dt} D_p^{(q)} \ge 0 \qquad \bigl(q = 0, 1, \ldots, 2(m-p)+1;\ p = 1, 2, \ldots, m;\ \alpha \le t \le \gamma\bigr). \tag{137}$$

Thus, when the argument increases from $t = \alpha$ to $t = \gamma$, then every minor
(133) with even $q$ is a monotone non-decreasing function and with odd $q$
is a monotone non-increasing function; but since the minor is positive for
$t = \alpha$ and $t = \gamma$, it is also positive for every intermediate value of $t$
($\alpha < t < \gamma$).

From the fact that the minors of (132) of order $p-1$ and those of order
$p$ that are formed from successive rows and columns are positive, it now
follows that all the minors of (132) of order $p$ are positive.⁸⁷

What we have proved implies that for every $t$ ($\alpha \le t \le \gamma$) the values
$s_0, s_1, \ldots, s_{2m-1}$ and (for $n = 2m+1$) $s_{-1}$ satisfy the inequalities (115)
and (116), i.e., that for every $t$ these values are the Markov parameters of
a certain Hurwitz polynomial. In other words, the whole arc (131) and,
in particular, the point $Q$ lies in the domain of stability $G$.

This completes the proof of Markov’s Theorem. 


Note. Since we have proved that every point of the arc (131) belongs
to $G$, the values of (131) for every $t$ ($\alpha \le t \le \gamma$) determine a totally positive
matrix $S(t) = \| s_{i+k}(t) \|_0^\infty$ of rank $m$. Therefore the inequalities (135)
and consequently (137) as well hold for every $t$ ($\alpha \le t \le \gamma$), i.e., with
increasing $t$ every $D_p^{(q)}$ increases for even $q$ and decreases for odd $q$ ($q = 0, 1,$
$2, \ldots, 2(m-p)+1$; $p = 1, \ldots, m$). In other words, from $P \preceq Q \preceq R$
it follows that

$$(-1)^q D_p^{(q)}(P) \le (-1)^q D_p^{(q)}(Q) \le (-1)^q D_p^{(q)}(R)$$
$$(q = 0, 1, \ldots, 2(m-p)+1;\ p = 1, \ldots, m).$$


These inequalities for $q = 0, 1$ give Markov’s inequalities (p. 241).
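
Markov's theorem and the inequalities of the Note can be illustrated numerically for $m = 2$ (so $n = 4$). The sketch below is not part of the original text. The function names and the two sample points are assumptions chosen for the illustration, and membership in $G$ is tested through the leading principal minors $D_p^{(0)}$, $D_p^{(1)}$ of the Hankel matrices of the forms (124), which by Sylvester's criterion is equivalent to their positive definiteness.

```python
import numpy as np

def minor(s, p, q):
    """D_p^{(q)} of (133): determinant of [s_{q+i+k}], i, k = 0..p-1."""
    rows = [[s[q + i + k] for k in range(p)] for i in range(p)]
    return float(np.linalg.det(np.array(rows, dtype=float)))

def in_stability_domain(s):
    """s = (s_0, ..., s_{2m-1}); both forms (124) positive definite."""
    m = len(s) // 2
    return all(minor(s, p, q) > 0 for p in range(1, m + 1) for q in (0, 1))

def weakly_precedes(P, Q):
    """Relations (130) without the strictness clause (n = 2m)."""
    return all(a <= b if l % 2 == 0 else a >= b
               for l, (a, b) in enumerate(zip(P, Q)))

def markov_inequalities_hold(P, Q, R, eps=1e-12):
    """(-1)^q D_p^{(q)}(P) <= (-1)^q D_p^{(q)}(Q) <= (-1)^q D_p^{(q)}(R)."""
    m = len(P) // 2
    for p in range(1, m + 1):
        for q in range(2 * (m - p) + 2):
            dP, dQ, dR = ((-1) ** q * minor(X, p, q) for X in (P, Q, R))
            if not (dP <= dQ + eps and dQ <= dR + eps):
                return False
    return True

# Two sample points of G with P preceding R (values chosen for illustration).
P = [2.0, 3.0, 5.0, 10.0]
R = [3.0, 2.9, 5.2, 9.9]
Q = [(p + r) / 2 for p, r in zip(P, R)]   # one point between P and R
print(in_stability_domain(P), in_stability_domain(Q), in_stability_domain(R))
print(markov_inequalities_hold(P, Q, R))
```

Any other point $Q$ whose coordinates lie between those of $P$ and $R$ could be substituted for the midpoint; the theorem asserts that it again belongs to $G$.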
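We now come to the Chebyshev-Markov theorem mentioned at the beginning
of this section. Again we quote from Markov’s memoir:⁸⁸


87 This follows from Fekete’s determinant identity (see [17], pp. 306-7).
88 See [38], p. 103, beginning with line 5.
THEOREM ON ROOTS: If the numbers

$$a_0, a_1, a_2, \ldots, a_{2m-2}, a_{2m-1},$$
$$s_0, s_1, s_2, \ldots, s_{2m-2}, s_{2m-1},$$
$$b_0, b_1, b_2, \ldots, b_{2m-2}, b_{2m-1}$$

satisfy all the conditions of the preceding theorem,⁸⁹ then the
equations

$$
\begin{vmatrix}
a_0 & a_1 & \cdots & a_{m-1} & 1 \\
a_1 & a_2 & \cdots & a_m & x \\
a_2 & a_3 & \cdots & a_{m+1} & x^2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_m & a_{m+1} & \cdots & a_{2m-1} & x^m
\end{vmatrix} = 0,
$$

$$
\begin{vmatrix}
s_0 & s_1 & \cdots & s_{m-1} & 1 \\
s_1 & s_2 & \cdots & s_m & x \\
s_2 & s_3 & \cdots & s_{m+1} & x^2 \\
\vdots & \vdots & & \vdots & \vdots \\
s_m & s_{m+1} & \cdots & s_{2m-1} & x^m
\end{vmatrix} = 0,
$$

$$
\begin{vmatrix}
b_0 & b_1 & \cdots & b_{m-1} & 1 \\
b_1 & b_2 & \cdots & b_m & x \\
b_2 & b_3 & \cdots & b_{m+1} & x^2 \\
\vdots & \vdots & & \vdots & \vdots \\
b_m & b_{m+1} & \cdots & b_{2m-1} & x^m
\end{vmatrix} = 0
$$

of degree $m$ in the unknown $x$ do not have multiple or imaginary
or negative roots.


And the roots of the second equation are larger than the corresponding
roots of the first equation and smaller than the corresponding
roots of the last equation.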


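89 He refers to the preceding theorem, Markov’s theorem on determinants (p. 241).

Let us find out the connection of this theorem with the domain of stability
in the space of the Markov parameters. Setting $f(z) = h(z^2) + z g(z^2)$
and

$$h(-v) = c_0 v^m + c_1 v^{m-1} + \cdots + c_m \qquad (c_0 \ne 0),$$

we obtain from the expansion (105)

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots$$

the identity

$$-g(-v) = \left(-s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \frac{s_2}{v^3} + \cdots\right)\left(c_0 v^m + c_1 v^{m-1} + \cdots + c_m\right).$$

Equating to zero the coefficients of the powers $v^{-1}, v^{-2}, \ldots, v^{-m}$, we find:

$$
\begin{aligned}
s_0 c_m + s_1 c_{m-1} + \cdots + s_m c_0 &= 0, \\
s_1 c_m + s_2 c_{m-1} + \cdots + s_{m+1} c_0 &= 0, \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
s_{m-1} c_m + s_m c_{m-1} + \cdots + s_{2m-1} c_0 &= 0;
\end{aligned}
\tag{138}
$$

to these relations we add the equation

$$h(-v) = 0, \tag{139}$$

written as

$$c_m + v c_{m-1} + v^2 c_{m-2} + \cdots + v^m c_0 = 0. \tag{139′}$$

Eliminating from (138) and (139′) the coefficients $c_0, c_1, \ldots, c_m$, we represent
the equation (139) in the form

$$
\begin{vmatrix}
s_0 & s_1 & \cdots & s_{m-1} & 1 \\
s_1 & s_2 & \cdots & s_m & v \\
s_2 & s_3 & \cdots & s_{m+1} & v^2 \\
\vdots & \vdots & & \vdots & \vdots \\
s_m & s_{m+1} & \cdots & s_{2m-1} & v^m
\end{vmatrix} = 0.
\tag{139″}
$$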


Thus, the algebraic equation in the Chebyshev-Markov theorem coincides
with (139) and the inequalities imposed on $s_0, s_1, \ldots, s_{2m-1}$ coincide with
the inequalities (115) that determine the domain of stability in the space
of the Markov parameters.

The Chebyshev-Markov theorem shows how the roots $u_1 = -v_1$,
$u_2 = -v_2, \ldots, u_m = -v_m$ of $h(u)$ change when the corresponding Markov
parameters $s_0, s_1, \ldots, s_{2m-1}$ vary in the domain of stability.

The first part of the theorem states something we already know: When
the inequalities (115) are satisfied, then all the roots $u_1, u_2, \ldots, u_m$ of $h(u)$
are simple, real, and negative.⁹⁰ We denote them as follows:

$$u_1(P), u_2(P), \ldots, u_m(P),$$

where $P$ is the corresponding point of $G$.
The second (fundamental) part of the Chebyshev-Markov theorem can
be stated as follows:


90 See Theorem 13, on p. 228.
THEOREM 22 (Chebyshev-Markov): If $P$ and $Q$ are two points of $G$ and
$P$ ‘precedes’ $Q$,

$$P \prec Q, \tag{140}$$

then⁹¹

$$u_1(P) < u_1(Q),\quad u_2(P) < u_2(Q),\quad \ldots,\quad u_m(P) < u_m(Q). \tag{141}$$


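Proof. The coefficients of $h(u)$ can be expressed rationally in terms of
the parameters $s_0, s_1, \ldots, s_{2m-1}$.⁹² Then

$$h(u_i) = 0 \qquad (i = 1, 2, \ldots, m)$$

implies that:⁹³

$$\frac{\partial h(u_i)}{\partial s_l} + h'(u_i) \frac{\partial u_i}{\partial s_l} = 0 \qquad (i = 1, 2, \ldots, m;\ l = 0, 1, \ldots, 2m-1). \tag{142}$$

On the other hand, when we differentiate the expansion

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots$$

term by term with respect to $s_l$, we find:

$$\frac{h(u) \dfrac{\partial g(u)}{\partial s_l} - g(u) \dfrac{\partial h(u)}{\partial s_l}}{h^2(u)} = \frac{(-1)^l}{u^{l+1}} + \frac{1}{u^{2m+1}}\,(*). \tag{143}$$

Multiplying both sides of this equation by $\dfrac{h^2(u)}{u - u_i}$ and denoting the coefficient
of $u^l$ in this polynomial by $C_{il}$, we obtain:

$$\frac{h(u) \dfrac{\partial g(u)}{\partial s_l} - g(u) \dfrac{\partial h(u)}{\partial s_l}}{u - u_i} = \frac{(-1)^l C_{il}}{u} + \cdots. \tag{144}$$

Comparing the coefficients of $1/u$ (the residues) on the two sides of (144),
we find:

$$(-1)^{l-1} g(u_i) \frac{\partial h(u_i)}{\partial s_l} = C_{il}, \tag{145}$$

which gives in conjunction with (142):

$$\frac{\partial u_i}{\partial s_l} = \frac{(-1)^l C_{il}}{g(u_i)\, h'(u_i)}.$$

91 In other words, the roots $u_1, u_2, \ldots, u_m$ increase with increasing $s_0, s_2, \ldots, s_{2m-2}$ and
with decreasing $s_1, s_3, \ldots, s_{2m-1}$.
92 For example, by the equations (138) if, for simplicity, we set $c_0 = 1$ in these equations.
93 Here $\dfrac{\partial h(u_i)}{\partial s_l} = \left[\dfrac{\partial h(u)}{\partial s_l}\right]_{u = u_i}$.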


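Introducing the values

$$R_i = \frac{g(u_i)}{h'(u_i)} \qquad (i = 1, 2, \ldots, m), \tag{146}$$

we obtain the formula of Chebyshev-Markov:

$$\frac{\partial u_i}{\partial s_l} = \frac{(-1)^l C_{il}}{R_i\,[h'(u_i)]^2} \qquad (i = 1, 2, \ldots, m;\ l = 0, 1, \ldots, 2m-1). \tag{147}$$

But in the domain of stability the values $R_i$ ($i = 1, 2, \ldots, m$) are positive
(see (90′) on p. 226). The same can be said of the coefficients $C_{il}$. For

$$\frac{h^2(u)}{u - u_i} = c_0^2\,(u + v_1)^2 \cdots (u + v_{i-1})^2\,(u + v_i)\,(u + v_{i+1})^2 \cdots (u + v_m)^2, \tag{148}$$

where

$$v_i = -u_i > 0 \qquad (i = 1, 2, \ldots, m).$$

From (148) it is clear that all the coefficients $C_{il}$ in the expansion of
$\dfrac{h^2(u)}{u - u_i}$ in powers of $u$ are positive. Thus, we obtain from the Chebyshev-Markov
formula:

$$(-1)^l \frac{\partial u_i}{\partial s_l} > 0. \tag{149}$$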


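In the proof of Markov’s theorem we have shown that any two points
$P \prec Q$ of $G$ can be joined by an arc $s_l = (-1)^l \varphi_l(t)$ ($l = 0, 1, \ldots, 2m-1$),
where $\varphi_l(t)$ is a monotonic increasing differentiable function of $t$ ($t$ varies
within the limits $\alpha$ and $\beta$ ($\alpha < \beta$), and $t = \alpha$ corresponds to $P$, $t = \beta$ to $Q$).
Then along this arc we have, by (149):⁹⁴

$$\frac{du_i}{dt} = \sum_{l=0}^{2m-1} \frac{\partial u_i}{\partial s_l} \frac{d s_l}{dt} \ge 0, \qquad \frac{du_i}{dt} \not\equiv 0 \qquad (\alpha \le t \le \beta). \tag{150}$$

Hence by integrating we obtain:

$$u_i \big|_{t=\alpha} = u_i(P) < u_i \big|_{t=\beta} = u_i(Q) \qquad (i = 1, 2, \ldots, m).$$

This completes the proof of the Chebyshev-Markov theorem.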
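
The monotonic dependence of the roots on the Markov parameters can be observed numerically. The sketch below is illustrative and not from the original text: the function name `roots_of_h` and the sample points are assumptions. It solves the system (138) with $c_0 = 1$ for $c_1, \ldots, c_m$, finds the roots $v_i$ of $h(-v) = c_0 v^m + \cdots + c_m$, and returns $u_i = -v_i$.

```python
import numpy as np

def roots_of_h(s):
    """Roots u_i = -v_i of h(u) determined by s_0, ..., s_{2m-1} via (138)."""
    s = np.asarray(s, dtype=float)
    m = len(s) // 2
    # Row i of (138) with c_0 = 1:
    #   s_i c_m + s_{i+1} c_{m-1} + ... + s_{i+m-1} c_1 = -s_{i+m}.
    A = np.array([[s[i + k] for k in range(m)] for i in range(m)])
    x = np.linalg.solve(A, -s[m:2 * m])        # x[k] = c_{m-k}
    c = np.concatenate(([1.0], x[::-1]))       # c_0, c_1, ..., c_m
    v = np.roots(c).real                       # roots of h(-v)
    return np.sort(-v)                         # the u_i, in increasing order

# Two sample points of G (m = 2) with P preceding R, chosen for illustration.
P = [2.0, 3.0, 5.0, 10.0]
R = [3.0, 2.9, 5.2, 9.9]
print(roots_of_h(P))
print(roots_of_h(R))
```

For the point $P$ the system (138) gives $c_1 = -5$, $c_2 = 5$, so that $v^2 - 5v + 5 = 0$ and $u = -(5 \pm \sqrt{5})/2$; each root computed for $R$ should lie to the right of the corresponding root for $P$, as (141) requires.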


§ 18. The Generalized Routh-Hurwitz Problem 


1. In this section we shall give a rule to determine the number of roots in the
right half-plane of a polynomial $f(z)$ with complex coefficients.


94 Since $(-1)^l \dfrac{d s_l}{dt} = \dfrac{d \varphi_l}{dt} \ge 0$ ($\alpha \le t \le \beta$) and for at least one $l$ there exist values
of $t$ for which $(-1)^l \dfrac{d s_l}{dt} > 0$.
Suppose that

$$f(iz) = b_0 z^n + b_1 z^{n-1} + \cdots + b_n + i\,(a_0 z^n + a_1 z^{n-1} + \cdots + a_n), \tag{151}$$

where $a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_n$ are real numbers. If the degree of $f(z)$
is $n$, then $b_0 + i a_0 \ne 0$. Without loss of generality we may assume that
$a_0 \ne 0$ (otherwise we could replace $f(z)$ by $i f(z)$).
We shall assume that the real polynomials

$$a_0 z^n + a_1 z^{n-1} + \cdots + a_n \quad \text{and} \quad b_0 z^n + b_1 z^{n-1} + \cdots + b_n \tag{152}$$

are co-prime, i.e., that their resultant does not vanish:⁹⁵

$$
\nabla_{2n} =
\begin{vmatrix}
a_0 & a_1 & \cdots & a_n & 0 & \cdots & 0 \\
b_0 & b_1 & \cdots & b_n & 0 & \cdots & 0 \\
0 & a_0 & \cdots & a_{n-1} & a_n & \cdots & 0 \\
0 & b_0 & \cdots & b_{n-1} & b_n & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots
\end{vmatrix}
\ne 0.
\tag{153}
$$


Hence it follows, in particular, that the polynomials (152) have no roots in 
common and that, therefore, f(z) has no roots on the imaginary axis. 

We denote by $k$ the number of roots of $f(z)$ with positive real parts.
By considering the domain in the right half-plane bounded by the imaginary
axis and the semi-circle of radius $R$ ($R \to \infty$) and by repeating verbatim
the arguments used on p. 177 for the real polynomial $f(z)$, we obtain the
formula for the increment of $\arg f(z)$ along the imaginary axis

$$\Delta_{-\infty}^{+\infty} \arg f(i\omega) = (n - 2k)\,\pi. \tag{154}$$

Hence we obtain, by (151), in view of $a_0 \ne 0$:

$$I_{-\infty}^{+\infty}\, \frac{b_0 \omega^n + b_1 \omega^{n-1} + \cdots + b_n}{a_0 \omega^n + a_1 \omega^{n-1} + \cdots + a_n} = n - 2k. \tag{155}$$

Using Theorem 10 of § 11 (p. 215), we now obtain:

$$k = V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2n}), \tag{156}$$

where

$$
\nabla_{2p} =
\begin{vmatrix}
a_0 & a_1 & \cdots & a_{2p-1} \\
b_0 & b_1 & \cdots & b_{2p-1} \\
0 & a_0 & \cdots & a_{2p-2} \\
0 & b_0 & \cdots & b_{2p-2} \\
\vdots & \vdots & & \vdots
\end{vmatrix}
\qquad (p = 1, 2, \ldots, n;\ a_k = b_k = 0 \text{ for } k > n).
\tag{157}
$$

95 $\nabla_{2n}$ is a determinant of order $2n$.

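We have thus arrived at the following theorem.


THEOREM 23: If a complex polynomial $f(z)$ is given for which

$$f(iz) = b_0 z^n + b_1 z^{n-1} + \cdots + b_n + i\,(a_0 z^n + a_1 z^{n-1} + \cdots + a_n) \qquad (a_0 \ne 0)$$

and if the polynomials $a_0 z^n + \cdots + a_n$ and $b_0 z^n + \cdots + b_n$ are co-prime
($\nabla_{2n} \ne 0$), then the number of roots of $f(z)$ in the right half-plane is determined
by the formulas (156) and (157).
Moreover, if some of the determinants (157) vanish, then for each group
of successive zeros

$$(\nabla_{2h} \ne 0) \qquad \nabla_{2h+2} = \cdots = \nabla_{2h+2p} = 0 \qquad (\nabla_{2h+2p+2} \ne 0) \tag{158}$$

in the calculation of $V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2n})$ we must set:

$$\operatorname{sign} \nabla_{2h+2j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} \nabla_{2h} \qquad (j = 1, 2, \ldots, p) \tag{159}$$

or, what is the same,

$$
V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p}, \nabla_{2h+2p+2}) =
\begin{cases}
\dfrac{p+1}{2} & \text{for odd } p, \\[2ex]
\dfrac{p+1-\varepsilon}{2} & \text{for even } p \text{ and } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign} \dfrac{\nabla_{2h+2p+2}}{\nabla_{2h}}.
\end{cases}
\tag{160}
$$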
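
The rule of Theorem 23 can be sketched in code for the regular case in which none of the determinants (157) vanishes. The following is an illustration under stated assumptions, not the author's algorithm: the function name `count_right_half_plane_roots` and the tolerance `tol` are hypothetical, the determinants are evaluated in floating point, and the singular case governed by (159) and (160) is deliberately not implemented.

```python
import numpy as np

def count_right_half_plane_roots(coeffs, tol=1e-9):
    """Number k of roots of f with positive real part, by (156) and (157).

    coeffs: complex coefficients of f(z), highest power first.
    Raises ValueError if some determinant (157) vanishes (rule (159) needed).
    """
    f = np.asarray(coeffs, dtype=complex)
    n = len(f) - 1
    powers_of_i = [1, 1j, -1, -1j]
    # Coefficient of z^(n-j) in f(iz) is f_j * i^(n-j) = b_j + i a_j.
    w = np.array([f[j] * powers_of_i[(n - j) % 4] for j in range(n + 1)])
    if abs(w[0].imag) < tol:          # a_0 = 0: pass from f to i*f
        w = 1j * w
    a, b = w.imag, w.real

    def coef(c, k):
        return c[k] if 0 <= k <= n else 0.0   # a_k = b_k = 0 for k > n

    signs = [1.0]
    for p in range(1, n + 1):
        M = np.zeros((2 * p, 2 * p))
        for r in range(p):            # rows 2r, 2r+1: a and b shifted by r
            for col in range(2 * p):
                M[2 * r, col] = coef(a, col - r)
                M[2 * r + 1, col] = coef(b, col - r)
        d = np.linalg.det(M)
        if abs(d) < tol:
            raise ValueError("a determinant (157) vanishes; rule (159) is needed")
        signs.append(np.sign(d))
    # V(1, nabla_2, ..., nabla_2n): number of sign changes in the sequence.
    return sum(1 for x, y in zip(signs, signs[1:]) if x != y)

# f(z) = (z - 1 - i)(z + 2) has exactly one root in the right half-plane.
print(count_right_half_plane_roots([1, 1 - 1j, -2 - 2j]))
```

For this example the sketch passes to $i f(z)$, since $a_0 = 0$ for the original polynomial, and obtains $\nabla_2 = 1$, $\nabla_4 = -4$, so that $V(1, 1, -4) = 1$.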
We leave it to the reader to verify that in the special case where $f(z)$
is a real polynomial we can obtain the Routh-Hurwitz theorem (see § 6)
from Theorem 23.

In conclusion, we mention that in this chapter we have dealt with the
application of quadratic forms (in particular, Hankel forms) to one problem
of the disposition of the roots of a polynomial in the complex plane.
Quadratic and hermitian forms also have interesting applications to other
problems of the disposition of roots. We refer the reader who is interested in
these questions to the survey, already quoted, of M. G. Krein and M. A.
Naimark ‘The method of symmetric and hermitian forms in the theory
of separation of roots of algebraic equations,’ (Kharkov, 1936).


96 Suitable algorithms for the solution of the generalized Routh-Hurwitz problem can
be found in the monograph [41] and in the paper [39]. See also [7] and [37].

Angle between vectors, 242
reduction to principal axes, 309
restricted, 306
semidefinite, 304
signature of, 296, 298
singular, 294
Hermite-Biehler theorem, 228
Ince, 147

Inertia, law of, 297, 334
product, 132

Invariant plane, of operator, 283 


JACOBI, formula of, 302, 336
identity of, 114 
method of, 300 
theorem of, 303 
Jacobi matrix, 99 
Jordan basis, 201 
Jordan block, 151 
Jordan chains of columns, 165 
Jordan form of matrix, 152, 201, 202

KARPELEVICH, 87
Kolmogorov, 83, 87, 92
Kotelyanskii, 103
Krein, 221, 250
Kronecker, 75; 25, 37, 40
Krylov, 203 
transformation of, 206 


LAGRANGE, method of, 299 
Lagrange interpolation polynomial, 10 
Lagrange-Sylvester interpolation polynomial
λ-matrix, 150
kernel of, 39


Lappo-Danilevskii, 168, 170, 171 
Left value, 81 
Legendre polynomials, 258 
Liénard, 173, 221 
Liénard-Chipart stability criterion, 221
Limit of sequence of matrices, 33 
Linear (in)dependence of vectors, 51 
Linear transformation, 3 
Logarithm of matrix, 239
Lyapunov, 173, 185 
criterion of, 120 
equivalence in the sense of, 118 
theorem of, 187 
Lyapunov matrix, 117 
Lyapunov transformation, 117 


MACMILLAN, 115
Mapping, affine, 245 
Markov, 173, 240 
theorem of, 242 
Markov chain, acyclic, 8& 
cyclic, 88 
fully regular, 88 
homogeneous, 83 
period of, 96 
(ir)reducible, 38
regular, 88 
Markov parameters, 283, 234 
Matricant, 127 
Matrices, addition of, 4 
group property, 18 
annihilating polynomial of, 89 


applications to differential equations, 


116ff. 
congruence of, 296 
difference of, 5 
equivalence of, 132, 133 
equivalent, 61ff. 
left-equivalence of, 132, 133 
limit of sequence of, 33 
multiplication on left by H, 14 
product of, 6 
quotient of, 17 
rank of product, 12 
similarity of, 67 
unitary similarity of, 242 
with same real part of spectrum, 122 
adjoint, 82, 266 
reduced, 90 
blocks of, 41 
canonical form of, 63, 135, 136, 139, 141, 
152, 192, 201, 202, 264, 265 
cells of, 41 
characteristic, 82 
characteristic polynomial of, 82 


Matrix, column, 2 


commuting, 7 
companion, 149 
completely reducible, 81 
complex, iff. 
orthogonal, normal form of, 23
representation of as product, 6 
skew-symmetric, normal form of, 18 
symmetric, normal form of, 11 
components of, 105 
compound, 19ff., 20 
computation of power of, 109 
constituent, 105 
of coordinate transformation, 60 
cyclic form of, 54 
decomposition into triangular factors, 
33ff. 
derivative of, 117 
determinant of, 1, 5 
diagonal, 3 
multiplication by, 8 
diagonal form of, 152 
dimension of, 1 
elementary, 132 
elementary divisors of, 142, 144, 194 
elements of, 1 
function of, 95ff. 
defined on spectrum, 96 
fundamental, 73 
Gaussian form of, 39 
Hankel, 338; 205 
projective, 20 
Hurwitz, 190 
idempotent, 226 
infinite, rank of, 239 
integral, 126; 113 
normalized, 114 
invariant polynomials of, 139, 144, 194 
inverse of, 15 
minors of, 19ff. 
irreducible, 50 
(im)primitive, 80
Jacobi, 99 
Jordan form of, 152, 201, 202 
A,, 130 
and linear operator, 56 
logarithm of, 239 
Lyapunov, 117
minimal polynomial of, 89 
uniqueness of, 90 
minor of, 2 
principal, 2 
multiplication of, by number, 5 
by matrix, 17
Matrix, nilpotent, 226
Matrix equations, 215ff.


non-negative, 50 
totally, 98 
non-singular, 15 
normal, 269 
normal form of, 150, 192, 201, 202 
notation for, 1 
order of, 1 
orthogonal, 263 
oscillatory, 103 
partitioned, 41, 42 
permutable, 7 
permutation of, 50 
polynomial, see polynomial matrix 
polynomials in, permutability of, 13 
positive, 50 
spectra of, 58 6 
totally, 98 
power of, 12 
computation of, 109 
power series in, 113
principal minor of, 2 
quasi-triangular, 43 
rank of, 2 
reducible, 50, 51 
normal form of, 75 
representation as product, 264 
root of non-singular, 233 
root of singular, 234ff., 239 
Routh, 191 
row, 2 
of simple structure, 73 
singular, 15 
skew-symmetric, 19 
square, 1 
square root of, 239 
stochastic, 8$ 
fully regular, 88 
regular, 88 
spur of, 87 
subdiagonal of, 13 
superdiagonal of, 13 
symmetric, 19 
trace of, 87 
transformation of coordinate, 60 
transforming, 35, 60 
transpose of, 19 
triangular, 18, 218; 155 
unit, 12 
unitary, 263, 269 


unitary, representation of as product, 5 


upper quasi-triangular, 43 
upper triangular, 18 
Matrix addition, properties of, 4 


uniqueness of solution, 16 
Matrix multiplication, 6, 7 
Matrix polynomials, 76 

left quotient of, 78 

multiplication of, 77 
Maxwell, 172 
Mean, convergence in, of series, 260 
Metric, 242

euclidean, 245 

hermitian, 243, 244 

positive definite, 243

positive semidefinite, 243 
Minimal indices for columns, 38 
Minor, 2 

almost principal, 102 

of zero density, 104 
Modulus, left, 275 
Moments, problem of, 286, 287 
Motion, of mechanical system, 125 

of point, 121 

stability of, 125 

asymptotic, 125 


NAIMARK, 221, 238, 250
Nilpotency, index of, 226
Norm, left, 275

of vector, 243
Null vector, 52
Nullity of vector space, 64 
Number space, n-dimensional, 52 


OPERATIONS, elementary, 134 
Operator (linear), 55, 66 
adjoint, 265 
decomposition of, 281 
hermitian, 268 
positive definite, 274 
positive semidefinite, 274 
projective, 20 
spectrum of, 272 
identity, 66 
invariant plane of, 283 
matrix corresponding to, 56 
normal, 268 
positive definite, 280 
positive semidefinite, 280 
normal, 280 
orthogonal, of first kind, 281 
(im)proper, 281
of second kind, 281 
polar decomposition of, 276, 286 
real, 282 
semidefinite, 274, 280
Operator (linear), of simple structure, 72
Polynomial matrix, 76, 130


skew-symmetric, 280 

square root of, 275 

symmetric, 280 

transposed, 280 

unitary, 268 

spectrum of, 273
Operators, addition of, 57

multiplication of, 58
Order of matrix, 1 
Orlando, formula of, 196 
Orthogonal complement, 266 
Orthogonalization, 256 
Oscillations, small, of system, 326 


PARAMETERS, homogeneous, 26 
Markov, 288, 234

Parseval, equality of, 261 

Peano, 127 

Pencil of hermitian forms, 338 
characteristic equation of, 338 
characteristic values of, 338 
principal vector of, 338 


Pencil(s) of matrices, canonical form of, 


87, 39 
congruent, 41 
elementary divisors of, infinite, 27 
rank of, 29 
regular, 25 
singular, 25 
strict equivalence of, 24 
Pencil of quadratic forms, 310 
characteristic equation of, 310 
characteristic value of, 310 
principal column of, 310 
principal matrix of, 312 
principal vector of, 310 
Period, of Markov chain, 96 
Permutation of matrix, 50 
Perron, &3 
formula of, 116 
Petrovskii, 113 
Polynomial(s), annihilating, 176, 177 
minimal, 176 
of square matrix, 89 
of Chebyshev, 259 
characteristic, 71 
interpolation, 97, 101, 103 
invariant, 139, 144, 194 
of Legendre, 258 
matrix, see matrix polynomials 
minimal, 89, 176, 177 
monic, 176 
scalar, 76 
positive pair of, 227 


elementary operations on, 130, 131 
regular, 76 
order of, 76 
Power of matrix, 12 
Probability, absolute, 93 
limiting, 94 
mean limiting, 96 
transition, 82 
final, 88 
limiting, 88 
mean limiting, 96 
Product, inner, of vectors, 243 
scalar, of vectors, 242, 243 
of operators, 58 
of sequences, 6 
Pythagoras, theorem of, 244 


QUASI-ERGODIC THEOREM, 95 
Quasi-triangular matrix, 43 
Quotients of matrices, 17 


RANK, of infinite matrix, 239 
of matrix, 2 
of pencil, 29 
of vector space, 64 
Relative concepts, 184 
Right value, 81 
Ring, 17 
Romanovskii, 83 
Root of matrix, 233, 234ff., 239 
Rotation of space, 287 
Routh, 173, 201 
criterion of, 180 
Routh-Hurwitz, criterion of, 194 
Routh matrix, 191 
Routh scheme, 179 
Row matrix, 2 


SCHLESINGER, 133 
Schur, formulas of, 46 
Schwarz, inequality of, 255 
Sequence of vectors, 256, 260 
Series, convergence of, 260 
fundamental, of solutions, 38 
Signature of quadratic form, 296, 298 
Similarity of matrices, 67 
Singularity, 143 
Smirnov, 171 
Space, coefficient, 282 
decomposition of, 177, 248 
dilatation of, 287 
euclidean, 242, 245 
extension of, to unitary space, 282 
factor, 183
Space, rotation of, 287
unitary, 242, 243
as extension of euclidean, 282
Spectrum, 96, 272, 273; 53
Spur, 87
Square(s), independent, 297
positive, 334
Stability, criterion of, 221
domain of, 232
of motion, 125
of solution of linear system, 129
States, essential, 92
limiting, 92
non-essential, 92
Stieltjes, theorem of, 232
Stodola, 173
Sturm, theorem of, 175
Sturm chain, 175
generalized, 176
Subdiagonal, 13
Subspace, characteristic, 71
coordinate, 51
cyclic, 185
generated by vector, 185
invariant, 178
vector, 63
Substitution, integral, 143, 169
Suleimanova, 87
Superdiagonal, 13
Sylvester, identity of, 32, 33
inequality of, 66
Systems of differential equations, application of matrices to, 116ff.
equivalent, 118
reducible, 118
regular, 121, 168
singularity of, 148
stability of solution, 129
Systems of vectors, bi-orthogonal, 267
orthonormal, 245


TRACE, 87
Transformation, linear, 3
of coordinates, 59
orthogonal, 242, 263
unitary, 242, 263
written as matrix equation, 7
Lyapunov, 117
Transforming matrix, 35, 60
Transpose, 19, 280
Transposition, 18


UNIT SUM OF SQUARES, 314
Unit sphere, 315
Unit vector, 244


VALUE(S), characteristic, maximal, 53
extremal properties of, 317
latent, 69
left and right, of function, 81
proper, 69
Vector(s), 51
angle between, 242
bundle of, 183
Jordan chain of, 202
complex, 282
congruence of, 181
extremal, 55
inner product of, 243
Jordan chain of, 201
latent, 69
length of, 242, 243
linear dependence of, 51
test for, 251
modulo 7, 183
linear independence of, 51
norm of, 243
normalized, 244; 66
null, 52
orthogonal, 244, 248
orthogonalization of sequence, 256
principal, 318, 338
proper, 69
projecting, 248
projection of, orthogonal, 248
real, 282
scalar product of, 242, 243
systems of, bi-orthogonal, 267
orthonormal, 245
unit, 244
Vector space, 50ff., 51
basis of, 51
defect of, 64
dimension of, 51
finite-dimensional, 51
infinite-dimensional, 51
nullity of, 64
rank of, 64
Vector subspace, 63
Volterra, 183, 145, 147
Vyshnegradskii, 172


WEIERSTRASS, 25


